How to get data in a specific format using Scala? - json

I have raw JSON in the following format:
"luns": [
{
"numReadBlocks": 15444876,
"numWriteBlocks": 13530714,
"blockSizeInBytes": 512,
"writeIops": 495344,
"readIops": 312702,
"serialNumber": "aaaaaaa",
"uuid": "id",
"shareState": "none",
"usedBytes": 6721716224,
"totalSizeBytes": 16106127360,
"path": "/vol/lun_23052014_025830_vol/lun_23052014_025830"
},
{
"numReadBlocks": 15444876,
"numWriteBlocks": 13530714,
"blockSizeInBytes": 512,
"writeIops": 495344,
"readIops": 312702,
"serialNumber": "aaaaaaa",
"uuid": "id",
"shareState": "none",
"usedBytes": 6721716224,
"totalSizeBytes": 16106127360,
"path": "/vol/lun_23052014_025830_vol/lun_23052014_025830"
}]
The luns field may contain a list with any number of entries.
I want to process the above JSON and produce output in the following form:
"topStorageLuns": [
{
"name": "Free (in GB)",
"data": [7.79,7.79]
},
{
"name": "Used (in GB)",
"data": [7.21,7.21]
}]
I tried the following to get that output:
val storageLuns = myRawJson
val topStorageLuns = storageLuns.map { storageLun =>
  val totalLunsSizeOnStorageDevice = storageLun.luns.foldLeft(0.0) {
    case (totalBytesOnDevice, lun) =>
      totalBytesOnDevice + lun.usedBytes.getOrElse(0.0).toString.toLong
  }
  val totalAvailableLunsOnStorageDevice = storageLun.luns.foldLeft(0.0) {
    case (totalBytesOnDevice, lun) =>
      totalBytesOnDevice + lun.usedBytes.getOrElse(0.0).toString.toLong
  }
  Json.obj("name" -> storageLun.hostId, "data" -> "%.2f".format(totalLunsSizeOnStorageDevice / (1024 * 1024 * 1024)).toDouble)
}
Can anybody help me get the desired output?

The key lesson I want to impart is that your algorithm should reflect the shape of the output you want. Work backward from the result you want in order to build the algorithm.
It looks to me like you want to create an array of length 2, where each entry has a corresponding algorithm (space used, space free). Within each of these elements, you want a nested array with an element for each item in your input array, calculated using the algorithm from the outer array. Here's how I would approach the problem:
1) Define your algorithms
val dfAlgorithm: (Seq[(String, JsValue)] => Double) = _.foldLeft(0.0) { (acc, item) =>
  /* whatever logic you need to do */
}
val duAlgorithm: (Seq[(String, JsValue)] => Double) = _.foldLeft(0.0) { (acc, item) =>
  /* whatever logic you need to do */
}
2) Create a data structure to map over to build your final output
val stats = Seq("Free (in GB)" -> dfAlgorithm, "Used (in GB)" -> duAlgorithm)
3) Map over your input data within your mapping over your algorithms (the logic here reflects the shape of the result you want)
stats.map { case (name, algorithm) =>
  Json.obj("name" -> name, "data" -> storageLuns.map { storageLun => algorithm(storageLun) })
}
This isn't going to be a turnkey solution, since I don't know how your free/used algorithms are supposed to work, but this overall scheme should get you there.
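To make this concrete, here is a minimal self-contained sketch of the same scheme, under the assumption that free space is totalSizeBytes minus usedBytes per lun; the Lun and StorageLun case classes are hypothetical stand-ins for however you deserialize the question's JSON, and the rounding mimics the two-decimal values in the desired output:

import play.api.libs.json._

// Hypothetical case classes mirroring the fields used from the question's JSON.
case class Lun(usedBytes: Long, totalSizeBytes: Long)
case class StorageLun(luns: Seq[Lun])

val bytesPerGb = 1024.0 * 1024 * 1024

def round2(d: Double): Double =
  BigDecimal(d).setScale(2, BigDecimal.RoundingMode.HALF_UP).toDouble

// One algorithm per output row, keyed by the row's display name.
val usedGb: Lun => Double = lun => lun.usedBytes / bytesPerGb
val freeGb: Lun => Double = lun => (lun.totalSizeBytes - lun.usedBytes) / bytesPerGb

val stats = Seq("Free (in GB)" -> freeGb, "Used (in GB)" -> usedGb)

// The output shape drives the nesting: the outer map runs over the two
// algorithms, the inner map over the luns.
def topStorageLuns(storageLun: StorageLun): JsValue =
  Json.obj("topStorageLuns" -> stats.map { case (name, algorithm) =>
    Json.obj("name" -> name, "data" -> storageLun.luns.map(lun => round2(algorithm(lun))))
  })

Keeping the per-row algorithms separate from the traversal means that adding a third row later is just one more pair in stats.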

Related

Karate: Matching data between nested arrays

Is there a way in Karate to match response data from an API that contains a nested array for a key, where the key-value pairs inside the nested array are in a different order?
Scenario: Verify original data contains expected data
def original = [{ a:1, b: [{c:2},{d:3}]}]
def expected = [{ b: [{d:3},{c:2}], a:1 }]
Using the contains deep method will solve the issue, but since the original data comes from an API response, if one more field gets added to the response at some point, my scenario will still pass.
Don't try to do everything in one line. Split your matches; there is more explanation in the docs:
* def inner = [{ c: 2 }, { d: 3 }]
* def response = [{ a: 1, b: [{ d: 3 }, { c: 2 }]}]
* match each response contains { b: '#(^^inner)' }
* match each response == { a: 1, b: '#(^^inner)' }
* match response[0] == { a: 1, b: '#(^^inner)' }
* match response == [{ a: 1, b: '#(^^inner)' }]
You don't need to use all of these; I'm showing the possible options.

Intersect a FeatureCollection in MongoDB

I have a GeoJSON file filled with the states of Austria, and I want to run a query that outputs which states intersect my polygon.
This is my query:
db.GeoAustria.find(
  {
    'features.geometry': {
      $geoIntersects: {
        $geometry: {
          type: "Polygon",
          coordinates: [
            [
              [ 16.21685028076172, 48.007381433478855 ],
              [ 16.24225616455078, 47.98716432210271 ],
              [ 16.256675720214844, 48.00669234420252 ],
              [ 16.21685028076172, 48.007381433478855 ]
            ]
          ]
        }
      }
    }
  }
)
But it gives me all the features, including those that don't overlap the polygon...
Where is my mistake in this query?
This is a basic array-match misunderstanding. The input set is a single doc with 95 polygons in an array in a single FeatureCollection object. When you do a find() on such things, any individual geometry that intersects will cause the entire doc to be returned as a match. This is exactly the same as:
> db.foo.insert({x:["A","B","C"]})
WriteResult({ "nInserted" : 1 })
> db.foo.find({x:"A"});
{ "_id" : ObjectId("5fb1845b08c09fb8dfe8d1c1"), "x" : [ "A", "B", "C" ] }
The whole doc is returned, not just element "A".
Let's assume that you might have more than one big doc in your collection. This pipeline yields the single target geometry for Baden (I tested it on your input set):
var Xcoords = [
  [
    [ 16.21685028076172, 48.007381433478855 ],
    [ 16.24225616455078, 47.98716432210271 ],
    [ 16.256675720214844, 48.00669234420252 ],
    [ 16.21685028076172, 48.007381433478855 ]
  ]
];
var targ = {type: "Polygon", coordinates: Xcoords};
db.geo1.aggregate([
  // First, eliminate any docs where the geometry array has zero intersects. In this
  // context, features.geometry means "for each element of array features, get the
  // geometry field from the object there", almost like saying "features.?.geometry".
  {$match: {"features.geometry": {$geoIntersects: {$geometry: targ}} }}

  // Next, break up any passing docs of 95 geoms into 95 docs of 1 geom...
  ,{$unwind: "$features"}

  // ...and run THE SAME $match as before to match just the one we are looking for.
  // In this context, the array is gone and "features.geometry" means get JUST the
  // object named geometry:
  ,{$match: {"features.geometry": {$geoIntersects: {$geometry: targ}} }}
]);
Beyond this, I might recommend breaking up that FeatureCollection into something that is both indexable (a FeatureCollection is NOT indexable in MongoDB) and easier to deal with. For example, this little script, run against your single-doc/many-polys design, will convert it into 95 docs with extra info:
db.geo2.drop();
mainDoc = db.geo1.findOne(); // the one Austria doc
mainDoc['features'].forEach(function(oneFeature) {
  var qq = {
    country: "Austria",
    crs: mainDoc['crs'],
    properties: oneFeature['properties'],
    geometry: oneFeature['geometry']
  };
  db.geo2.insert(qq);
});

db.geo2.aggregate([
  {$match: {"geometry": {$geoIntersects: {$geometry: targ}} }}
]);
// yields the same single doc output (Baden)
This allows ease of matching and filtering. For more on FeatureCollection vs. GeometryCollection see https://www.moschetti.org/rants/hurricane.html.
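As a small follow-up sketch (using the geo2 collection and targ variable from the snippets above): once each feature lives in its own doc, the geometry field can carry a 2dsphere index, which was not possible on the original FeatureCollection, and a plain find() is enough:

// Index the per-feature geometry field (not possible on a FeatureCollection):
db.geo2.createIndex({geometry: "2dsphere"});

// With one doc per feature, find() returns only the intersecting features,
// no $unwind needed:
db.geo2.find({geometry: {$geoIntersects: {$geometry: targ}}});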

How to merge a dynamically named record with a static one in Dhall?

I'm creating an AWS Step Function definition in Dhall. However, I don't know how to create a common structure used for Choice states, such as the example below:
{
  "Not": {
    "Variable": "$.type",
    "StringEquals": "Private"
  },
  "Next": "Public"
}
The Not is pretty straightforward using mapKey and mapValue. If I define a basic Comparison:
let ComparisonOperator =
      { Type =
          { Variable : Text
          , StringEquals : Optional Text
          }
      , default =
          { Variable = "foo"
          , StringEquals = None Text
          }
      }
And the types:
let ComparisonType = < And | Or | Not >
And adding a helper function to render the type as Text for the mapKey:
let renderComparisonType = \(comparisonType : ComparisonType)
    -> merge
        { And = "And"
        , Or = "Or"
        , Not = "Not"
        }
        comparisonType
Then I can use them in a function to generate the record halfway:
let renderRuleComparisons =
      \(comparisonType : ComparisonType) ->
      \(comparisons : List ComparisonOperator.Type) ->
        let keyName = renderComparisonType comparisonType
        let compare = [ { mapKey = keyName, mapValue = comparisons } ]
        in compare
If I run that using:
let rando = ComparisonOperator::{ Variable = "$.name", StringEquals = Some "Cow" }
let comparisons = renderRuleComparisons ComparisonType.Not [ rando ]
in comparisons
Running that through dhall-to-json, it'll output the first part:
{
  "Not": {
    "Variable": "$.name",
    "StringEquals": "Cow"
  }
}
... but I've been struggling to merge that with "Next": "Public". I've tried all the record merges like /\, //, etc., and I keep getting various type errors I don't truly understand yet.
First, I'll include an approach that does not type-check as a starting point to motivate the solution:
let rando = ComparisonOperator::{ Variable = "$.name", StringEquals = Some "Cow" }
let comparisons = renderRuleComparisons ComparisonType.Not [ rando ]
in comparisons # toMap { Next = "Public" }
toMap is a keyword that converts records to key-value lists, and # is the list concatenation operator. The Dhall CheatSheet has a few examples of how to use both of them.
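For instance, toMap turns a record literal into a sorted list of mapKey/mapValue pairs:

toMap { a = 1, b = 2 }
-- = [ { mapKey = "a", mapValue = 1 }, { mapKey = "b", mapValue = 2 } ]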
The above solution doesn't work because # cannot merge lists with different element types. The left-hand side of the # operator has this type:
comparisons : List { mapKey : Text, mapValue : Comparison.Type }
... whereas the right-hand side of the # operator has this type:
toMap { Next = "Public" } : List { mapKey : Text, mapValue : Text }
... so the two Lists cannot be merged as-is due to the different types for the mapValue field.
There are two ways to resolve this:
Approach 1: Use a union whenever there is a type conflict
Approach 2: Use a weakly-typed JSON representation that can hold arbitrary values
Approach 1 is the simpler solution for this particular example and Approach 2 is the more general solution that can handle really weird JSON schemas.
For Approach 1, dhall-to-json will automatically strip non-empty union constructors (leaving behind the value they were wrapping) when translating to JSON. This means that you can transform both arguments of the # operator to agree on this common type:
List { mapKey : Text, mapValue : < State : Text | Comparison : Comparison.Type > }
... and then you should be able to concatenate the two lists of key-value pairs and dhall-to-json will render them correctly.
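As a rough sketch of Approach 1 (untested here; the union name Value is made up, while ComparisonOperator and rando are reused from the question):

let Value = < State : Text | Comparison : ComparisonOperator.Type >

let merged
    : List { mapKey : Text, mapValue : Value }
    =   [ { mapKey = "Not", mapValue = Value.Comparison rando } ]
      # toMap { Next = Value.State "Public" }

in  merged

dhall-to-json strips the Value.Comparison and Value.State constructors on output, leaving just the wrapped values behind.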
There is a second solution for dealing with weakly-typed JSON schemas that you can learn more about here:
Dhall Manual - How to convert an existing YAML configuration file to Dhall
The basic idea is that all of the JSON/YAML integrations recognize and support a weakly-typed JSON representation that can hold arbitrary JSON data, including dictionaries with keys of different shapes (like in your example). You don't even need to convert the entire expression to this weakly-typed representation; you only need to use it for the subset of your configuration where you run into schema issues.
What this means for your example is that you would change both arguments of the # operator to have this type:
let Prelude = https://prelude.dhall-lang.org/v12.0.0/package.dhall
in List { mapKey : Text, mapValue : Prelude.JSON.Type }
The documentation for Prelude.JSON.Type also has more details on how to use this type.
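For illustration, a sketch of what the Not/Next pair could look like in that representation (untested; the JSON.object and JSON.string helpers are from the Prelude's JSON package):

let Prelude = https://prelude.dhall-lang.org/v12.0.0/package.dhall

let JSON = Prelude.JSON

in  [ { mapKey = "Not"
      , mapValue =
          JSON.object
            ( toMap
                { Variable = JSON.string "$.name"
                , StringEquals = JSON.string "Cow"
                }
            )
      }
    , { mapKey = "Next", mapValue = JSON.string "Public" }
    ]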

Retrieving dictionary keys with pre-fixed parent keys using python

I am trying to list all keys from a dictionary, prefixed with their parent keys, using Python 3. How can I achieve this?
Here is what I have so far, using a recursive function (so that I can use this with dictionaries of any depth).
Here, if I do not use header_prefix, I get all the keys without parent keys. However, when I use header_prefix, it keeps adding parent keys incorrectly to the keys. Basically, I cannot reset header_prefix in the appropriate location.
from pprint import pprint
#%%
data = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Stack for MyProject 01",
    "Resources": {
        "elb01": {
            "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
            "Properties": {
                "CrossZone": "false",
                "HealthCheck": {
                    "Target": "TCP:80",
                    "Interval": "20"
                },
                "ConnectionSettings": {
                    "IdleTimeout": "120"
                }
            }
        },
        "lc01": {
            "Type": "AWS::AutoScaling::LaunchConfiguration",
            "Properties": {
                "ImageId": "ami-01010105",
                "InstanceType": "t2.medium"
            }
        },
        "asg01": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {
                "HealthCheckGracePeriod": 300,
                "HealthCheckType": "EC2"
            }
        }
    }
}
pprint(data)
#%%
def get_headers(json_data, headers, header_prefix):
    for key, value in json_data.items():
        if type(value) == dict:
            header_prefix = header_prefix + key + '.'
            get_headers(value, headers, header_prefix)
        else:
            headers.append(header_prefix + key)
    return(headers)
#%%
header_list = []
prefix = ''
data_headers = get_headers(data, header_list, prefix)
pprint(data_headers)
From the above code, I get the following output:
['AWSTemplateFormatVersion',
'Description',
'Resources.elb01.Type',
'Resources.elb01.Properties.CrossZone',
'Resources.elb01.Properties.HealthCheck.Target',
'Resources.elb01.Properties.HealthCheck.Interval',
'Resources.elb01.Properties.HealthCheck.ConnectionSettings.IdleTimeout',
'Resources.elb01.lc01.Type',
'Resources.elb01.lc01.Properties.ImageId',
'Resources.elb01.lc01.Properties.InstanceType',
'Resources.elb01.lc01.asg01.Type',
'Resources.elb01.lc01.asg01.Properties.HealthCheckGracePeriod',
'Resources.elb01.lc01.asg01.Properties.HealthCheckType']
My expected output is like below:
['AWSTemplateFormatVersion',
'Description',
'Resources.elb01.Type',
'Resources.elb01.Properties.CrossZone',
'Resources.elb01.Properties.HealthCheck.Target',
'Resources.elb01.Properties.HealthCheck.Interval',
'Resources.elb01.Properties.ConnectionSettings.IdleTimeout',
'Resources.lc01.Type',
'Resources.lc01.Properties.ImageId',
'Resources.lc01.Properties.InstanceType',
'Resources.asg01.Type',
'Resources.asg01.Properties.HealthCheckGracePeriod',
'Resources.asg01.Properties.HealthCheckType']
It seems to be a scoping issue. When you modify header_prefix inside the if statement, it modifies it in the function scope, and so for all iterations of the loop, leading to the incorrect version being passed to get_headers in later iterations of the loop.
In short:
Change
header_prefix = header_prefix + key + '.'
get_headers(value,headers,header_prefix)
To
pfx = header_prefix + key + '.'
get_headers(value,headers,pfx)
This way a new local variable will be created and passed, rather than the header_prefix being updated within the function scope.
(Any variable name that's not already used within the get_headers function will do.)
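Putting the fix into the full function from the question, a minimal corrected version looks like this:

def get_headers(json_data, headers, header_prefix):
    for key, value in json_data.items():
        if isinstance(value, dict):
            # Build a fresh prefix for the recursive call instead of
            # mutating header_prefix in this scope.
            pfx = header_prefix + key + '.'
            get_headers(value, headers, pfx)
        else:
            headers.append(header_prefix + key)
    return headers

data_headers = get_headers(data, [], '')

Because pfx is local to each iteration, siblings such as lc01 and asg01 are prefixed with Resources. again instead of inheriting the previously visited branch, which produces the expected output above.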

R - Create JSON for Adobe Analytics API call - define object conditionally

I have the following code to create JSON for making a call to the Adobe Analytics API (method segment.save):
item <- list(
  definition = list(
    container = list(
      type = "hits",
      operator = "or",
      rules = I(list(
        list(value = "test1 test2",
             operator = "contains_any",
             element = "page")))
    )
  ),
  owner = "test",
  reportSuiteID = "test",
  description = "API Generated Segment",
  name = "test segment"
)
Once prettified and auto-unboxed, the result is:
> jsonlite::toJSON(item, pretty = T, auto_unbox = T)
{
  "definition": {
    "container": {
      "type": "hits",
      "operator": "or",
      "rules": [
        {
          "value": "test1 test2",
          "operator": "contains_any",
          "element": "page"
        }
      ]
    }
  },
  "owner": "test",
  "reportSuiteID": "test",
  "description": "API Generated Segment",
  "name": "test segment"
}
This is good for creating new segments, but not so good for editing them.
The JSON structure is valid, as I am able to create the new segment. However, I would like to check whether the segment already exists (using, for instance, the GetSegments() function from randyzwitch's RSiteCatalyst package to check whether the name matches an already-created segment). If the segment already exists, I want to pass its id to the API call, which is the method used for editing existing segments. It should then look like:
> jsonlite::toJSON(item, pretty = T, auto_unbox = T)
{
  "definition": {
    ...
  },
  "owner": "test",
  "reportSuiteID": "test",
  "description": "API Generated Segment",
  "name": "test segment",
  "id": "s1982XXXXXXXXX_XXXXX_XXXXX"
}
Is it possible to put an if-like statement inside the list() definition from the first piece of code? I would like a solution that does not need an if statement that checks whether segmentID exists and, depending on the result, generates a call with or without the id.
Once a "JSON alike structure" is created using list function:
we can push new elements onto it under the needed conditions. For example, if we have our segment IDs in a data frame named segments, we can add the ID to item this way:
if (!is.na(segments$segmentID[i])) {
  item <- c(item, id = segments$segmentID[i])
}
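A minimal end-to-end sketch of that pattern (the segments data frame and the index i are illustrative assumptions; in practice the lookup could come from RSiteCatalyst's GetSegments()):

library(jsonlite)

# Hypothetical lookup of existing segments by name.
segments <- data.frame(
  name = "test segment",
  segmentID = "s1982XXXXXXXXX_XXXXX_XXXXX",
  stringsAsFactors = FALSE
)
i <- which(segments$name == "test segment")

# Append the id only when an existing segment was found, so the same
# item works for both creating and editing via segment.save.
if (length(i) == 1 && !is.na(segments$segmentID[i])) {
  item <- c(item, id = segments$segmentID[i])
}

toJSON(item, pretty = TRUE, auto_unbox = TRUE)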