This is the script
def validate_record_schema(record):
    device = record.get('Payload', {})
    manual_added = device.get('ManualAdded', None)
    location = device.get('Location', None)
    if isinstance(manual_added, dict) and isinstance(location, dict):
        if 'Value' in manual_added and 'Value' in location:
            return False
    return isinstance(manual_added, bool) and isinstance(location, str)
print([validate_record_schema(r) for r in data])
This is json data
data = [{
"Id": "12",
"Type": "DevicePropertyChangedEvent",
"Payload": [{
"DeviceType": "producttype",
"DeviceId": 2,
"IsFast": false,
"Payload": {
"DeviceInstanceId": 2,
"IsResetNeeded": false,
"ProductType": "product",
"Product": {
"Family": "home"
},
"Device": {
"DeviceFirmwareUpdate": {
"DeviceUpdateStatus": null,
"DeviceUpdateInProgress": null,
"DeviceUpdateProgress": null,
"LastDeviceUpdateId": null
},
"ManualAdded": {
"value":false
},
"Name": {
"Value": "Jigital60asew",
"IsUnique": true
},
"State": null,
"Location": {
"value":"bangalore"
},
"Serial": null,
"Version": "2.0.1.100"
}
}
}]
}]
For the line manual_added = device.get('ManualAdded', None), I am getting the following error: AttributeError: 'list' object has no attribute 'get'.
Where am I making a mistake, and how can I fix this error?
You are having problems tracking types as you traverse the data. One trick is to add debug prints along the way to see what is going on. For instance, the top-level "Payload" object is a list of dicts, not a single dict. The list implies that you can have more than one device descriptor, so I wrote a sample that checks all of them and returns False if it finds something wrong along the way. You will likely need to update this according to your validation rules, but it will get you started.
def validate_record_schema(record):
    """Validate that the 0 or more Payload dicts in record
    use proper types"""
    try:
        for device in record.get('Payload', []):
            payload = device.get('Payload', None)
            if payload is None:
                # is it okay to have a device without a payload?
                continue
            device = payload["Device"]
            if not isinstance(device["ManualAdded"]["value"], bool):
                return False
            if not isinstance(device["Location"]["value"], str):
                return False
    except KeyError as e:
        # report which key was missing
        print(f"missing key: {e}")
        return False
    return True
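The "add prints along the way" trick can be sketched as a small helper that records the type found at each step of a path. The name trace_types and the trimmed-down record are assumptions made for illustration; they are not part of the original code.

```python
def trace_types(obj, path):
    """Hypothetical debug helper: return the type name seen at each step of `path`."""
    seen = [type(obj).__name__]
    for step in path:
        obj = obj[step]  # raises KeyError/IndexError/TypeError on a bad step
        seen.append(type(obj).__name__)
    return seen

# Trimmed-down record with the same nesting as the sample data (assumption).
record = {"Payload": [{"Payload": {"Device": {"ManualAdded": {"value": False}}}}]}

print(trace_types(record, ["Payload", 0, "Payload", "Device", "ManualAdded", "value"]))
# ['dict', 'list', 'dict', 'dict', 'dict', 'dict', 'bool']
```

The second entry, list, is the spot where calling .get() fails in the original function.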
As the error suggests, you can't call .get() on a list; you have to index into it first. To get the Location and ManualAdded fields, you could use:
manual_added = record.get('Payload')[0].get('Payload').get('Device').get('ManualAdded')
location = record.get('Payload')[0].get('Payload').get('Device').get('Location')
So your function would become:
def validate_record_schema(record):
    manual_added = record.get('Payload')[0].get('Payload').get('Device').get('ManualAdded')
    location = record.get('Payload')[0].get('Payload').get('Device').get('Location')
    if isinstance(manual_added, dict) and isinstance(location, dict):
        if 'Value' in manual_added and 'Value' in location:
            return False
    return isinstance(manual_added, bool) and isinstance(location, str)
Note that this would set location to
{
"value":"bangalore"
}
and manual_added to
{
"value":false
}
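Because both fields arrive wrapped in a dict, a type check on them has to look one level deeper. The following is only a sketch: the function name validate_inner_values, the lowercase "value" key (taken from the sample data), and the trimmed-down record are assumptions, and whether this matches your intended validation rule is something you would need to confirm.

```python
def validate_inner_values(record):
    """Hypothetical variant: type-check the wrapped "value" entries of the first device."""
    device = record["Payload"][0]["Payload"]["Device"]
    manual_added = device["ManualAdded"].get("value")
    location = device["Location"].get("value")
    return isinstance(manual_added, bool) and isinstance(location, str)

# Trimmed-down record with the same nesting as the sample data (assumption).
record = {"Payload": [{"Payload": {"Device": {
    "ManualAdded": {"value": False},
    "Location": {"value": "bangalore"},
}}}]}

print(validate_inner_values(record))  # True
```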
I am currently working on a helper that walks through JSON along a path I pass in as parameters. I have created something like this:
from collections import abc
def walk(obj, *path):
    """
    Goes through the given json path. If it is found then return the given path else empty dict
    """
    for segment in path:
        if not isinstance(obj, abc.Mapping) or segment not in obj:
            print(f"Couldn't walk path; {path}")
            return {}
        obj = obj[segment]
    return obj
# -------------------------------------------------------- #
json_value = {
"id": "dc932304-dde4-3517-8b76-58081cc9dd0d",
"Information": [{
"merch": {
"id": "8fb66657-b93d-5f2d-8fe7-a5e355f0f3a8",
"status": "ACTIVE"
},
"value": {
"country": "SE IS BEST"
},
"View": {
"id": "9aae10f4-1b75-481d-ac5f-b17bc46675bd"
}
}],
"collectionTermIds": [
],
"resourceType": "thread",
"rollup": {
"totalThreads": 1,
"threads": [
]
},
"collectionsv2": {
"groupedCollectionTermIds": {
},
"collectionTermIds": [
]
}
}
# -------------------------------------------------------- #
t = walk(json_value, "Information", 0)
print(t)
My current problem is that I am trying to get the first item of the "Information" list by passing 0 to the walk function, but it reports Couldn't walk path; ('Information', 0)
How can I choose which list index to walk into through the parameters? For example, if I pass 1 it should report Couldn't walk path; ('Information', 1), but if I pass 0 it should return
{
"merch": {
"id": "8fb66657-b93d-5f2d-8fe7-a5e355f0f3a8",
"status": "ACTIVE"
},
"value": {
"country": "SE IS BEST"
},
"View": {
"id": "9aae10f4-1b75-481d-ac5f-b17bc46675bd"
}
}
This should work for any JSON object:
def walk(obj, *path):
    """
    Goes through the given json path. If it is found then return the given path else empty dict
    """
    try:
        segment, key, *rest = path
    except ValueError:
        # We are simply passed in a single key, ex. walk(obj, 'my_key')
        key = path[0]
        try:
            return obj[key]
        except KeyError:
            print(f"Couldn't walk path: {key!r}\n"
                  f" obj={obj!r}\n"
                  " reason=key missing from object")
            return {}
        except TypeError:
            print(f"Couldn't walk path: {key!r}\n"
                  f" obj={obj!r}\n"
                  " reason=object is not a mapping or array type.")
            return {}
        except IndexError as e:
            print(f"Couldn't walk path: {key!r}\n"
                  f" obj={obj!r}\n"
                  f" reason={e}")
            return {}
    # This indicates we want to walk down a nested (or deeply nested) path. We
    # can use recursion to solve this case.
    try:
        inner_obj = obj[segment]
    except KeyError:
        print(f"Couldn't walk path: {segment!r} -> {key!r}\n"
              f" obj={obj!r}\n"
              " reason=key missing from object")
        return {}
    except IndexError as e:
        print(f"Couldn't walk path: {segment!r} -> {key!r}\n"
              f" obj={obj!r}\n"
              f" reason={e}")
        return {}
    else:
        return walk(inner_obj, key, *rest)
That could probably be optimized a bit, for example by removing the duplicate except blocks with a slight modification.
Code for testing (using the json_value from above):
t = walk(json_value, 'Information', 1)
assert t == {}
t = walk(json_value, 'Information', 1, 'value', 'country')
assert t == {}
t = walk(json_value, 'Information', 0)
print(t)
# {'merch': {'id': '8fb66657-b93d-5f2d-8fe7-a5e355f0f3a8', 'status': 'ACTIVE'}, 'value': {'country': 'SE IS BEST'}, 'View': {'id': '9aae10f4-1b75-481d-ac5f-b17bc46675bd'}}
t = walk(json_value, 'Information', 0, 'value', 'country')
print(t)
# SE IS BEST
t = walk(json_value, 'collectionsv2', 'collectionTermIds')
print(t)
# []
t = walk(json_value, 'id')
print(t)
# dc932304-dde4-3517-8b76-58081cc9dd0d
t = walk(json_value, 'id', 'test')
assert t == {}
# error: wrong type
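One way the duplicated except blocks might be collapsed is to go back to a plain loop and catch all three exception types in a single clause. This is a sketch rather than a drop-in replacement: the error message format and the small sample object are assumptions. Note that indexing a string with an integer succeeds in Python, so a path such as ('id', 0) would return a character instead of {}.

```python
def walk(obj, *path):
    """Follow `path` through nested dicts/lists; return {} if any step fails."""
    for segment in path:
        try:
            obj = obj[segment]
        except (KeyError, IndexError, TypeError) as e:
            # KeyError: missing dict key, IndexError: list index out of range,
            # TypeError: object cannot be indexed with this segment
            print(f"Couldn't walk path: {path!r}; failed at {segment!r} ({e!r})")
            return {}
    return obj

# Small sample with the same shape as json_value above (assumption).
sample = {
    "id": "dc932304",
    "Information": [{"value": {"country": "SE IS BEST"}}],
}

print(walk(sample, "Information", 0, "value", "country"))  # SE IS BEST
print(walk(sample, "Information", 1))                      # {} (after the message)
```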
I'm working with a REST API that returns data in the following format:
{
"id": "2902cbad6da44459ad05abd1305eed14",
"displayName": "",
"sourceHost": "dev01.test.lan",
"sourceIP": "192.168.145.1",
"messagesPerSecond": 0,
"messages": 2733,
"size": 292062,
"archiveSize": 0,
"dates": [
{
"date": 1624921200000,
"messages": 279,
"size": 29753,
"archiveSize": 0
},
{
"date": 1625007600000,
"messages": 401,
"size": 42902,
"archiveSize": 0
}
]
}
I'm using json.loads to successfully pull the data from the API, and I now need to search for a particular "date:" value and read the corresponding "messages", "size" and "archiveSize" values.
I'm trying to use the "if-in" method to find the value I'm interested in, for example:
response = requests.request("GET", apiQuery, headers=headers, data=payload)
json_response = json.loads(response.text)
test = 2733
if test in json_response.values():
    print(f"Yes, value: '{test}' exist in dictionary")
else:
    print(f"No, value: '{test}' does not exist in dictionary")
This works fine for any value in the top section of the JSON return, but it never finds any values in the "dates" sub-branches.
I have two questions, firstly, how do I find the target "date" value? Secondly, once I find that "sub-branch" what would be the best way to extract the three values I need?
Thanks.
from json import load

def list_dates_whose_message_count_equals(dates=None, message_count=0):
    return list(filter(
        lambda date: date.get("messages") == message_count, dates
    ))

def main():
    json_ = {}
    with open("values.json", "r") as fp:
        json_ = load(fp)
    print(list_dates_whose_message_count_equals(json_["dates"], message_count=279))
    print(list_dates_whose_message_count_equals(json_["dates"], message_count=401))

if __name__ == "__main__":
    main()
This returns:
[{'date': 1624921200000, 'messages': 279, 'size': 29753, 'archiveSize': 0}]
[{'date': 1625007600000, 'messages': 401, 'size': 42902, 'archiveSize': 0}]
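The code above filters on "messages"; the same idea applies when searching by the "date" value, which is what the question asks for. A minimal sketch, assuming the response has already been parsed into a dict; find_date_entry is a hypothetical helper name and the response is trimmed to the relevant fields.

```python
def find_date_entry(response, target_date):
    """Return the first entry in response["dates"] whose "date" equals target_date, else None."""
    return next(
        (entry for entry in response.get("dates", []) if entry.get("date") == target_date),
        None,
    )

# Trimmed copy of the API response from the question.
response = {
    "messages": 2733,
    "dates": [
        {"date": 1624921200000, "messages": 279, "size": 29753, "archiveSize": 0},
        {"date": 1625007600000, "messages": 401, "size": 42902, "archiveSize": 0},
    ],
}

entry = find_date_entry(response, 1625007600000)
if entry is not None:
    # Extract the three values once the matching sub-branch is found.
    messages, size, archive_size = entry["messages"], entry["size"], entry["archiveSize"]
    print(messages, size, archive_size)  # 401 42902 0
```

The original `if test in json_response.values()` check only compares against top-level values, which is why entries nested inside the "dates" list were never found.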
Please help me parse two CSV files into a single JSON document in Groovy.
For example:
CSV1:
testKey,status
Name001,PASS
Name002,PASS
Name003,FAIL
CSV2:
Kt,Pd
PT-01,Name001
PT-02,Name002
PT-03,Name003
PT-04,Name004
I want to populate "testList" with data from CSV2.val[1..-1] and CSV1.val[1..-1].
The result should look like this:
{
"testExecutionKey": "DEMO-303",
"info": {
"user": "admin"
},
"tests": [
{
"TestKey": "PT-01",
"status": "PASS"
},
{
"TestKey": "PT-02",
"status": "PASS"
},
{
"TestKey": "PT-03",
"status": "FAIL"
}
]
}
Code without this modification (reading from only one CSV):
import groovy.json.*
def kindaFile = '''
TestKey;Finished;user;status
Name001;PASS;
Name002;PASS;
'''.trim()
def keys
def testList = []
//parse CSV
kindaFile.splitEachLine( /;/ ){ parts ->
if( !keys )
keys = parts
else{
def test = [:]
parts.eachWithIndex{ val, ix -> test[ keys[ ix ] ] = val }
testList << test
}
}
def builder = new JsonBuilder()
def root = builder {
testExecutionKey 'DEMO-303'
info user: 'admin'
tests testList
}
println JsonOutput.prettyPrint(JsonOutput.toJson(root))
Your sample JSON doesn't match the CSV definition. It looks like you're using fields [1..-1] from CSV1, as you stated, but fields [0..-2] from CSV2. As you only have two fields in each CSV, that's equivalent to csv1[1] and csv2[0]. The example below uses [0..-2] for CSV2. Note that if you always have exactly two fields in your input files, the following code could be simplified a little; I've given a more generic solution that can cope with more fields.
Load both CSV files into lists
File csv1 = new File( 'one.csv')
File csv2 = new File( 'two.csv')
def lines1 = csv1.readLines()
def lines2 = csv2.readLines()
assert lines1.size() <= lines2.size()
Note the assert. It's there because I noticed you have 4 tests in CSV2 but only 3 in CSV1. To allow the code to work with your sample data, it iterates through CSV1 and adds the data from the corresponding line of CSV2.
Get the field names
fieldSep = /,[ ]*/
def fieldNames1 = lines1[0].split( fieldSep )
// Note: this reads the CSV1 header, so the CSV2 key column is emitted under the name "testKey"
def fieldNames2 = lines1[0].split( fieldSep )
Build the testList collection
def testList = []
lines1[1..-1].eachWithIndex { csv1Line, lineNo ->
def mappedLine = [:]
def fieldsCsv1 = csv1Line.split( fieldSep )
fieldsCsv1[1..-1].eachWithIndex { value, fldNo ->
String name = fieldNames1[ fldNo + 1 ]
mappedLine[ name ] = value
}
def fieldsCsv2 = lines2[lineNo + 1].split( fieldSep )
fieldsCsv2[0..-2].eachWithIndex { value, fldNo ->
String name = fieldNames2[ fldNo ]
mappedLine[ name ] = value
}
testList << mappedLine
}
Parsing
You can now parse the list of maps with your existing code. I've made a change to the way the JSON string is displayed though.
def builder = new JsonBuilder()
def root = builder {
testExecutionKey 'DEMO-303'
info user: 'admin'
tests testList
}
println builder.toPrettyString()
JSON Output
Running the above code, using your CSV1 and CSV 2 data, gives the JSON that you desire.
for CSV1:
testKey,status
Name001,PASS
Name002,PASS
Name003,FAIL
and CSV2:
Kt,Pd
PT-01,Name007
PT-02,Name001
PT-03,Name003
PT-05,Name002
PT-06,Name004
PT-07,Name006
result is:
{
"testExecutionKey": "DEMO-303",
"info": {
"user": "admin"
},
"tests": [
{
"status": "PASS",
"testKey": "PT-01"
},
{
"status": "PASS",
"testKey": "PT-02"
},
{
"status": "FAIL",
"testKey": "PT-03"
}
]
}
But I need the rows matched by name rather than by line position (testKey from CSV1 = Pd from CSV2), so that each testKey in the result is the Kt of the matching CSV2 row:
{
"testExecutionKey": "DEMO-303",
"info": {
"user": "admin"
},
"tests": [
{
"testKey": "PT-02",
"status": "PASS"
},
{
"testKey": "PT-05",
"status": "PASS"
},
{
"testKey": "PT-03",
"status": "FAIL"
}
]
}
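Matching by name instead of by line position amounts to a lookup-map join: index CSV2 by its Pd column, then look up each CSV1 row. The sketch below shows the idea in Python for illustration only; the same approach (build a map from Pd to Kt, then iterate over CSV1) should carry over to Groovy. The function name build_tests and the choice to skip names with no match are assumptions.

```python
import csv
import io
import json

csv1 = """testKey,status
Name001,PASS
Name002,PASS
Name003,FAIL
"""

csv2 = """Kt,Pd
PT-01,Name007
PT-02,Name001
PT-03,Name003
PT-05,Name002
PT-06,Name004
PT-07,Name006
"""

def build_tests(csv1_text, csv2_text):
    # Map each Pd name to its Kt key so rows are matched by name, not position.
    kt_by_name = {row["Pd"]: row["Kt"] for row in csv.DictReader(io.StringIO(csv2_text))}
    tests = []
    for row in csv.DictReader(io.StringIO(csv1_text)):
        kt = kt_by_name.get(row["testKey"])
        if kt is not None:  # assumption: names missing from CSV2 are skipped
            tests.append({"testKey": kt, "status": row["status"]})
    return tests

result = {
    "testExecutionKey": "DEMO-303",
    "info": {"user": "admin"},
    "tests": build_tests(csv1, csv2),
}
print(json.dumps(result, indent=2))
```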
I'm getting a JSON object over the network, as a String. I'm then using Circe to parse it. I want to add a handful of fields to it, and then pass it on downstream.
Almost all of that works.
The problem is that my "adding" is really "overwriting". That's actually OK, as long as I add an empty object first. How can I add such an empty object?
Looking at the code below, I am overwriting "sometimes_empty:{}" and it works. But because sometimes_empty is not always empty, this results in some data loss. I'd like to add a field like "custom:{}" and then overwrite the value of custom with my existing code.
Two StackOverflow posts were helpful. One worked, but wasn't quite what I was looking for. The other I couldn't get to work.
1: Modifying a JSON array in Scala with circe
2: Adding field to a json using Circe
val js: String = """
{
"id": "19",
"type": "Party",
"field": {
"id": 1482,
"name": "Anne Party",
"url": "https"
},
"sometimes_empty": {
},
"bool": true,
"timestamp": "2018-12-18T11:39:18Z"
}
"""
val newJson = parse(js).toOption
.flatMap { doc =>
doc.hcursor
.downField("sometimes_empty")
.withFocus(_ =>
Json.fromFields(
Seq(
("myUrl", Json.fromString(myUrl)),
("valueZ", Json.fromString(valueZ)),
("valueQ", Json.fromString(valueQ)),
("balloons", Json.fromString(balloons))
)
)
)
.top
}
newJson match {
case Some(v) => return v.toString
case None => println("Failure!")
}
We need to do a couple of things. First, we zoom in on the specific property we want to update; if it doesn't exist, we create a new empty one. Then we turn the focused property from a Json into a JsonObject so that we can modify it using the +: method. Once we've done that, we take the updated property and re-introduce it into the original parsed JSON to get the complete result:
import io.circe.{Json, JsonObject, parser}
import io.circe.syntax._
object JsonTest {
def main(args: Array[String]): Unit = {
val js: String =
"""
|{
| "id": "19",
| "type": "Party",
| "field": {
| "id": 1482,
| "name": "Anne Party",
| "url": "https"
| },
| "bool": true,
| "timestamp": "2018-12-18T11:39:18Z"
|}
""".stripMargin
val maybeAppendedJson =
for {
json <- parser.parse(js).toOption
sometimesEmpty <- json.hcursor
.downField("sometimes_empty")
.focus
.orElse(Option(Json.fromJsonObject(JsonObject.empty)))
jsonObject <- json.asObject
emptyFieldJson <- sometimesEmpty.asObject
appendedField = emptyFieldJson.+:("added", Json.fromBoolean(true))
res = jsonObject.+:("sometimes_empty", appendedField.asJson)
} yield res
maybeAppendedJson.foreach(obj => println(obj.asJson.spaces2))
}
}
Yields (here for an input whose sometimes_empty already contained "someProperty": true):
{
"id" : "19",
"type" : "Party",
"field" : {
"id" : 1482,
"name" : "Anne Party",
"url" : "https"
},
"sometimes_empty" : {
"added" : true,
"someProperty" : true
},
"bool" : true,
"timestamp" : "2018-12-18T11:39:18Z"
}
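The logic itself (take the existing object or an empty one, add the new fields, put the result back) is independent of circe. The Python sketch below only illustrates that get-or-create-then-merge step with plain dicts; it says nothing about the circe API, and add_fields is a hypothetical name.

```python
import json

def add_fields(doc, key, new_fields):
    """Return a copy of doc where doc[key] keeps its existing fields and gains new_fields."""
    existing = doc.get(key, {})          # fall back to an empty object if key is absent
    merged = {**existing, **new_fields}  # existing fields survive; new ones are added
    return {**doc, key: merged}

doc = json.loads('{"id": "19", "sometimes_empty": {"someProperty": true}}')

# Existing object: its fields are kept and the new one is added.
print(add_fields(doc, "sometimes_empty", {"added": True}))

# Missing object: an empty one is created first, then filled.
print(add_fields(doc, "custom", {"added": True}))
```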
I have the following Json block that I have returned as a JsObject
{
"first_block": [
{
"name": "demo",
"description": "first demo description"
}
],
"second_block": [
{
"name": "second_demo",
"description": "second demo description",
"nested_second": [
{
"name": "bob",
"value": null
},
{
"name": "john",
"value": null
}
]
}
]
}
From this, I want to return a list of all the name and value pairs in the nested array of the second block. So, for the example above:
List([bob,null],[john,null]) or something along those lines.
The issue I am having is with the value section understanding null values. I've tried to match against it and return a string "null" but I can't get it to match on Null values.
What would be the best way for me to return back the name and values in the nested_second array.
I've tried using case classes and readAsNullable with no luck, and my latest attempt has gone along these lines:
val secondBlock = (jsObj \ "second_block").as[List[JsValue]]
secondBlock.foreach(nested_block => {
val nestedBlock = (nested_block \ "nested_second").as[List[JsValue]]
nestedBlock.foreach(value => {
val name = (value \ "name").as[String] //always a string
var convertedValue = ""
val replacement_value = value \ "value"
replacement_value match {
case JsDefined(null) => convertedValue = "null"
case _ => convertedValue = replacement_value.as[String]
}
println(name)
println(convertedValue)
})
}
)
It seems convertedValue returns as 'JsDefined(null)' regardless and I'm sure the way I'm doing it is horrifically bad.
Replace JsDefined(null) with JsDefined(JsNull).
You probably got confused because println(JsDefined(JsNull)) prints JsDefined(null). But that is not how the null value of a JSON field is represented: it is represented by the case object JsNull. This is simply good API design, where the possible cases are represented by a hierarchy of classes.
With play-json I always use case classes!
I simplified your problem to the essence:
import play.api.libs.json._
val jsonStr = """[
{
"name": "bob",
"value": null
},
{
"name": "john",
"value": "aValue"
},
{
"name": "john",
"value": null
}
]"""
Define a case class
case class Element(name: String, value: Option[String])
Add a formatter in the companion object:
object Element {
implicit val jsonFormat: Format[Element] = Json.format[Element]
}
And use validate:
Json.parse(jsonStr).validate[Seq[Element]] match {
case JsSuccess(elems, _) => println(elems)
case other => println(s"Handle exception $other")
}
This returns: List(Element(bob,None), Element(john,Some(aValue)), Element(john,None))
Now you can do whatever you want with the values.