I have a JSON object:
def secretJson = readJSON text: secret
def props = secretJson.SecretString
println secretJson.SecretString
Output:
{"Name":"welcome","Env":"test","Region":"us-east-1","Cloud":"AWS"}
Can we append a key and value to this object? For example:
{"Name":"welcome","Env":"test","Region":"us-east-1","Cloud":"AWS","readOnlyHost":"value"}
I'm consuming a REST API in Databricks using PySpark. The API response returns a list where each element of the list is a JSON string. When I parallelize the JSON, it yields a _corrupt_record column where each value of that column is a JSON string:
API Call:
response = requests.get(api_url, headers=api_call_header)
api_json = response.json()
df = spark.read.json(sc.parallelize(api_json))
display(df)
This is what the JSON string of a single value looks like when I copy it into a JSON validator:
{
  'Var1': 'String',
  'Var2': {
    'Var3': 'String',
    'Var4': None,
    'Var5': 'String',
    'Var6': 'String',
    'Var7': 'String',
    'Var8': 'String'
  },
  'Var9': None,
  'Var10': 'String'
}
For whatever reason, I can't access the nested Struct objects of Var2. When I use the from_json function and the following from-scratch schema, it yields NULL values from Var2 onward:
from pyspark.sql.types import StructType, StructField, StringType, NullType

schema = StructType([
    StructField('Var1', StringType()),
    StructField('Var2', StructType([
        StructField('Var3', StringType()),
        StructField('Var4', NullType()),
        StructField('Var5', StringType()),
        StructField('Var6', StringType()),
        StructField('Var7', StringType()),
        StructField('Var8', StringType())
    ])),
    StructField('Var9', NullType()),
    StructField('Var10', StringType())
])
This is my code attempting to parse the JSON string:
from pyspark.sql.functions import from_json, col

df = df.withColumn('struct_json', from_json(col('_corrupt_record'), schema))
That parses the first key:value pair but yields the rest of the column values as NULL:
object:
Var1: "String"
Var2: NULL
Var3: NULL
Var4: NULL
Var5: NULL
Var6: NULL
Var7: NULL
Var8: NULL
Var9: NULL
Var10: NULL
Any help would be much appreciated!
Attempted Solutions:
JSON Schema from Scratch - As mentioned above, it yields NULL values.
multiLine=True and allowSingleQuotes=True Read Options - Found these in another Stack Overflow post, but they still yielded NULL values when used with my from-scratch JSON schema.
JSON Schema Using rdd.map Method - I tried to derive a schema using json_schema = spark.read.json(df.rdd.map(lambda row: row._corrupt_record)).schema, but that simply created a one-layer Struct object where the layer consisted of the entire JSON string without any nested objects parsed out.
SQL to Parse Key:Value Pairs - Too many nested objects and arrays to parse successfully, and performance was too poor.
The answer to this was embarrassingly simple:
From the API call, api_json = response.json() creates a Python dictionary. This was confirmed with type(api_json).
Creating a DataFrame using the spark.read.json method was incorrect, since the source api_json data was a Python dictionary, not JSON.
So the fix was changing this:
response = requests.get(api_url, headers=api_call_header)
api_json = response.json()
df = spark.read.json(sc.parallelize(api_json))
display(df)
To this:
response = requests.get(api_url, headers=api_call_header)
api_json = response.json()
df = spark.createDataFrame(api_json, schema=schema)
display(df)
For the schema, I used the one I had built from scratch in PySpark.
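For contrast, a minimal sketch of why the original call misbehaved, assuming api_json is a list of dicts: spark.read.json expects JSON strings, so each dict gets stringified with Python's repr (single quotes, None), which is not valid JSON and lands in _corrupt_record. Serializing the dicts first would also have worked:
import json

# Convert each dict to a real JSON string before handing it to spark.read.json
json_strings = [json.dumps(record) for record in api_json]
df = spark.read.json(sc.parallelize(json_strings))
This also explains the single quotes and None values seen in the validator earlier.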
I am not able to serialize a custom object in Python 3.
Below are the explanation and code:
packages = []
data = {
    "simplefield": "value1",
    "complexfield": packages,
}
where packages is a list of custom Library objects.
Library is a class as below (I have also subclassed json.JSONEncoder, but it is not helping):
class Library(json.JSONEncoder):
    def __init__(self, name):
        print("calling Library constructor ")
        self.name = name

    def default(self, obj):
        print("calling default method ")
        if isinstance(obj, Library):
            return {"name": obj.name}
        raise TypeError("Object is not an instance of Library")
Now I am calling json.dumps(data), but it throws the exception below:
TypeError: Object of type Library is not JSON serializable
It seems "calling default method" is never printed, meaning Library.default is not called.
Can anybody please help here?
I have also referred to Serializing class instance to JSON, but it is not helping much.
Inheriting from json.JSONEncoder will not make a class serializable. You should define your encoder separately and then pass the encoder class via the cls argument of json.dumps.
See an example approach below:
import json
from typing import Any

class Library:
    def __init__(self, name):
        print("calling Library constructor ")
        self.name = name

    def __repr__(self):
        # A hand-rolled serialized version of the object
        return '{"name": "%s"}' % self.name

# Create your custom encoder
class CustomEncoder(json.JSONEncoder):
    def default(self, o: Any) -> Any:
        # Encode differently for different types
        return str(o) if isinstance(o, Library) else super().default(o)

packages = [Library("package1"), Library("package2")]
data = {
    "simplefield": "value1",
    "complexfield": packages,
}

# Use your encoder while serializing
print(json.dumps(data, cls=CustomEncoder))
The output looks like:
{"simplefield": "value1", "complexfield": ["{\"name\": \"package1\"}", "{\"name\": \"package2\"}"]}
You can use the default parameter of json.dumps:
def default(obj):
    if isinstance(obj, Library):
        return {"name": obj.name}
    raise TypeError(f"Can't serialize {obj!r}.")

json.dumps(data, default=default)
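With this approach the Library instances serialize as real nested objects rather than escaped strings:
{"simplefield": "value1", "complexfield": [{"name": "package1"}, {"name": "package2"}]}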
I register a CSV in django-admin and want a django-admin action to convert it to JSON and store the value in a JSONField.
However, in the django-admin action I'm getting this error and can't convert it to JSON.
admin.py
(....)
def read_data_csv(path):
    with open(path, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        data = []
        for row in reader:
            data.append(dict(row))
        return data

def convert(modeladmin, request, queryset):
    for extraction in queryset:
        csv_file_path = extraction.lawsuits
        read_data_csv(csv_file_path)
Error:
TypeError at /admin/core/extraction/
expected str, bytes or os.PathLike object, not FieldFile
The error is about extraction.lawsuits, which is a FieldFile instance, not a path. Just pass csv_file_path.path to read_data_csv as the argument. That should work.
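A minimal sketch of the corrected action under that suggestion; the json_data field name is hypothetical, so adjust it to your model:
def convert(modeladmin, request, queryset):
    for extraction in queryset:
        # FieldFile exposes the underlying filesystem path via .path
        data = read_data_csv(extraction.lawsuits.path)
        # Store the parsed rows in the JSONField (field name is illustrative)
        extraction.json_data = data
        extraction.save()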
def json = '{"book": [{"id": "01","language": "Java","edition": "third","author": "Herbert Schildt"},{"id": "07","language": "C++","edition": "second","author": "E.Balagurusamy"}]}'
Using Groovy, how do I print the "id" values from the "book" array?
Output:
[01, 07]
Here is a working example using your input JSON:
import groovy.json.*
def json = '''{"book": [
{"id": "01","language": "Java","edition": "third","author": "Herbert Schildt"},
{"id": "07","language": "C++","edition": "second","author": "E.Balagurusamy"}
]
}'''
def jsonObj = new JsonSlurper().parseText(json)
println jsonObj.book.id // Returns the list of values for the matching key
Demo on the Groovy console: https://groovyconsole.appspot.com/script/5178866532352000
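For context, jsonObj.book.id works because Groovy's GPath notation spreads property access across a list, so it is equivalent to either of these:
println jsonObj.book*.id               // [01, 07]
println jsonObj.book.collect { it.id } // [01, 07]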
I am using a JSR223 assertion with a Groovy script, and I am saving the parsed response as a variable:
def slurper = new groovy.json.JsonSlurper();
def t1= prev.getResponseDataAsString();
def response = slurper.parseText(t1);
vars.putObject("Summary", response);
Now I want to use this Summary variable in another call so that I can assert it:
def nn = ${SummaryJDBC};
But I am getting this error:
jmeter.threads.JMeterThread: Error while processing sampler
'Competitive_Landscape(Past_awardees)' : java.lang.ClassCastException:
java.util.ArrayList cannot be cast to java.lang.String
You should use the getObject() method:
nn = vars.getObject("SummaryJDBC")
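A minimal sketch of the follow-up JSR223 element, assuming the object was stored under the name used in the question. The ${...} syntax resolves through vars.get(), which casts the stored value to String, hence the ClassCastException on the ArrayList; getObject() returns it with its original type intact:
// Retrieve the stored object without a String cast
def summary = vars.getObject("Summary")
// It keeps its original type, so it can be asserted as a List
assert summary instanceof List
log.info("Summary size: " + summary.size())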