This is driving me nuts, and I don't understand what's wrong with my approach.
I generate a JSON object in SQL like this:
select @output = (
    select distinct lngEmpNo, txtFullName
    from tblSecret
    for json path, root('result'), include_null_values
)
I get a result like this:
{"result":[{"lngEmpNo":696969,"txtFullName":"Clinton, Bill"}]}
ISJSON() confirms that it's valid JSON, and JSON_QUERY(@output, '$.result') returns the array portion of the JSON object... cool!
BUT, when I try to use JSON_QUERY to extract a specific value:
SELECT JSON_QUERY(@jsonResponse, '$.result[0].txtFullName');
I get a NULL value. Why? I've tried it with the [0], without the [0], and of course, txtFullName[0].
I also tried prefixing the path with strict, as in SELECT JSON_QUERY(@jsonResponse, 'strict $.result[0].txtFullName');, and it tells me this:
Msg 13607, Level 16, State 4, Line 29
JSON path is not properly formatted. Unexpected character 't' is found at position 18.
What am I doing wrong? What is wrong with my structure?
JSON_QUERY will only extract an object or an array. You are trying to extract a single value, so you need to use JSON_VALUE. For example:
SELECT JSON_VALUE(@jsonResponse, '$.result[0].txtFullName');
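To see the difference side by side, here is a quick sketch that puts the JSON string from the question into a standalone variable (the DECLARE and the variable name are only for illustration):

DECLARE @jsonResponse NVARCHAR(MAX) =
    N'{"result":[{"lngEmpNo":696969,"txtFullName":"Clinton, Bill"}]}';

SELECT JSON_QUERY(@jsonResponse, '$.result[0]');               -- the whole object: JSON_QUERY extracts objects/arrays
SELECT JSON_QUERY(@jsonResponse, '$.result[0].txtFullName');   -- NULL in lax mode, because the path points at a scalar
SELECT JSON_VALUE(@jsonResponse, '$.result[0].txtFullName');   -- Clinton, Bill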
I got a JSON file similar to this:
{
    "code": 298484,
    "details": {
        "date": "0001-01-01",
        "code": 0
    }
}
code appears twice: the top-level one is filled and the one inside details is empty. I need the first one, together with the data in details. What is the approach in PySpark?
I tried to filter it with:
df = rdd.map(lambda r: (r['code'], r['details'])).toDF()
But the resulting DataFrame shows columns _1 and _2 (no schema).
Please try the following:
spark.read.json("path to json").select("code", "details.date")
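If you want to be explicit about which code you keep and flatten the nested fields, a slightly fuller sketch (the file path and the column aliases are assumptions) could look like this:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# multiLine is needed when the file is one pretty-printed JSON document rather than JSON Lines;
# Spark infers the nested schema automatically
df = spark.read.json("path to json", multiLine=True)

# keep the filled, top-level code and pull the nested fields out of details
result = df.select(
    col("code"),
    col("details.date").alias("date"),
    col("details.code").alias("details_code")
)
result.show()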
I am trying to extract JSON file data using Python, but I am running into some errors.
aircraft.json (json file):
{ "now" : 1609298440.3,
"messages" : 31501,
"aircraft" : [
{"hex":"abadf9","alt_baro":37000,"alt_geom":36625,"gs":541.9,"track":73.3,"baro_rate":0,"version":0,"nac_p":7,"nac_v":1,"sil":2,"sil_type":"unknown","mlat":[],"tisb":[],"messages":13,"seen":6.6,"rssi":-25.3},
{"hex":"acc02b","flight":"SWA312 ","alt_baro":37000,"alt_geom":36650,"gs":549.3,"track":62.2,"baro_rate":0,"category":"A3","nav_qnh":1013.6,"nav_altitude_mcp":36992,"nav_heading":56.2,"lat":42.171346,"lon":-93.298198,"nic":8,"rc":186,"seen_pos":66.3,"version":2,"nic_baro":1,"nac_p":8,"nac_v":1,"sil":3,"sil_type":"perhour","gva":1,"sda":2,"mlat":[],"tisb":[],"messages":1205,"seen":7.4,"rssi":-26.0},
{"hex":"ac9e9a","category":"A4","version":2,"sil_type":"perhour","mlat":[],"tisb":[],"messages":746,"seen":119.1,"rssi":-26.6},
{"hex":"a96577","flight":"DAL673 ","alt_baro":40025,"alt_geom":39625,"gs":371.4,"track":265.1,"baro_rate":0,"squawk":"2641","emergency":"none","category":"A4","nav_qnh":1013.6,"nav_altitude_mcp":40000,"nav_heading":258.8,"lat":42.057220,"lon":-94.098337,"nic":8,"rc":186,"seen_pos":0.9,"version":2,"nic_baro":1,"nac_p":9,"nac_v":1,"sil":3,"sil_type":"perhour","gva":2,"sda":2,"mlat":[],"tisb":[],"messages":3021,"seen":0.3,"rssi":-21.8},
{"hex":"aa56db","category":"A3","version":2,"sil_type":"perhour","mlat":[],"tisb":[],"messages":1651,"seen":85.3,"rssi":-26.4}
]
}
My code:
import json
json_file = open('test.json')
aircraft_json = json.load(json_file)
for i in aircraft_json['aircraft']:
    print(i['hex'], i['flight'], i['alt_baro'], i['alt_geom'], i['gs'], i['gs'], i['track'], i['baro_rate'],
          i['category'], i['nav_qnh'], i['nav_altitude_mcp'], i['lat'], i['lon'], i['nic'], i['rc'], i['seen_pos'],
          i['version'], i['nic_baro'], i['nac_p'], i['nac_v'], i['sil'], i['sil_type'], i['gva'], i['sda'],
          i['mlat'], i['tisb'], i['messages'], i['seen'], i['rssi'])
json_file.close()
Output:
Traceback (most recent call last):
File "/home/pi/aircraft_json_to_csv.py", line 11, in <module>
print(i['hex'],i['flight'],i['alt_baro'],i['alt_geom'],i['gs'],i['gs'],i['track'],i['baro_rate'],i[
KeyError: 'flight'
The JSON file is updated every second and may be missing keys such as 'flight' or any other key. My question is: if a key is missing, how do I replace the missing value with an empty placeholder without getting a KeyError?
Thank you
My advice would be to give each field a suitable default value and store these fields in a dictionary.
Then, instead of assuming the field is present, check if the field exists. If it doesn't, then apply the default value.
Below is a simple example of this in action.
The defaults dict has been populated with a few possible defaults to get you started; you would add the rest of the fields in the same way.
I've adapted the loop to iterate through the keys of the dict (all the known fields so to speak), and add the default value for any missing field.
import json

with open('aircraft.json') as json_file:
    aircraft_json = json.load(json_file)

defaults = {
    'alt_baro': 0,
    'alt_geom': 0,
    'version': 0,
    'baro_rate': 0,
    'mlat': [],
    'tisb': []
    # similarly for the other fields
}

for dat in aircraft_json['aircraft']:
    for field in defaults.keys():
        if field not in dat:
            dat[field] = defaults[field]
        print(dat[field], end=' ')
    print('')
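A slightly more compact alternative, if you don't need the defaults written back into each record, is to look each field up with dict.get(), which falls back to a default when the key is missing. A minimal sketch reusing the defaults dict from above:

for dat in aircraft_json['aircraft']:
    # dict.get() returns the default instead of raising KeyError for missing keys
    values = [dat.get(field, default) for field, default in defaults.items()]
    print(*values)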
The columns of my CSV file are: "1960","1961","1962","1963","1964","1965","1966","1967"
var Max = d3.max(data, function(d) {
    return d.1967;
});
The above code doesn't work. Error shown in the browser console:
SyntaxError: missing ; before statement
So how do I refer to these columns?
After the CSV file is read, objects will be created with properties like:
{
    1960: 55,
    1961: 12,
    ...
    1967: 77
}
Object property names can be specified as strings or numbers, but numeric property names are converted to strings. To retrieve those properties you cannot use the dot notation d.'1961' or d.1961; you have to use bracket notation: d['1961'].
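Applied to the snippet in the question, the call would look roughly like this (a sketch, assuming data holds the parsed CSV rows; the unary + converts the CSV string values to numbers before comparison):

var Max = d3.max(data, function(d) {
    return +d['1967'];   // bracket notation works for the numeric column name
});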