saving the cppheaderparser output as valid json - json

the python program
http://sourceforge.net/projects/cppheaderparser/
can parse a c++ header file and store the info (about classes etc) in a python dictionary.
Using the included example program readSampleClass.py and
data_string = ( repr(cppHeader) )
with open('data.txt', 'w') as outfile:
json.dumps(data_string,outfile)
it saved the output but it is not valid json as
it uses single, not double quotes and key part is not quoted.
sample of output: (reduced)
{'enums': [], 'variables': [], 'classes':
{'SampleClass':
{'inherits': [], 'line_number': 8, 'declaration_method': 'class', 'typedefs':
{'public': [], 'private': [], 'protected': []
}, 'abstract': False, 'parent': None,'parent': None, 'reference': 0, 'constant': 0, 'aliases': [], 'raw_type': 'void', 'typedef': None, 'mutable': False
}], 'virtual': False, 'rtnType': 'int', 'returns_class': False, 'name': 'anotherFreeFunction', 'constructor': False, 'inline': False, 'returns_pointer': 0, 'defined': False
}]
}
so the question is:
How can I make it use double quotes and not single and how can I also make it quote the value part. Like False in sample.
I assume is possible as the creator of cppheaderparser wrote
about json.dumps(repr(cppHeader))
https://twitter.com/senexcanis/status/559444754166198272
Why use the json lib if its not valid jason?
That said I have never used python before and it might just not work as i think.
-update-
After some json doc reading, i gave up on json.dump as it seems to do nothing to the output in this case.
I ended up doing
data_string = ( repr(cppHeader) )
data_string = string.replace(data_string,'\'', '\"')
data_string = string.replace(data_string,'False', '\"False\"')
data_string = string.replace(data_string,'True', '\"True\"')
data_string = string.replace(data_string,'None', '\"None\"')
data_string = string.replace(data_string,'...', '')
with open('data.txt', 'w') as outfile:
outfile.write (data_string)
which give valid json - at least for my test c++ headers.
-update 2-
The creator of cppheaderparse just released a new 2.6 version where its possible to write CppHeaderParser.CppHeader("yourHeader.h").toJSON() to save as json.

Related

Python Regex: How to match the string and then modify that string by adding something at the end

UPDATED CODE: It is working but now the problem is that the code is attaching same random_value to every Path.
Following is my code with a sample chunk of text. I want to read Path and it's value then add (/some unique random alphabet and number combination) at the end of every Path value without changing the already existed value. For example I want the Path to be like
"Path" : "already existed value/1A" e.t.c something like that.
I am unable to make the exact regex pattern of replacing it.
Any help would be appreciated.
It can be done by json parse but the requirement of the task is to do it via REGEX.
from io import StringIO
import re
import string
import random
reader = StringIO("""{
"Bounds": [
{
"HasClip": true,
"Lang": "no",
"Page": 0,
"Path": "//Document/Sect[2]/Aside/P",
"Text": "Potsdam, den 9. Juni 2021 ",
"TextSize": 12.0
}
],
},
{
"Bounds": [
{
"HasClip": true,
"Lang": "de",
"Page": 0,
"Path": "//Document/Sect[3]/P[4]",
"Text": "this is some text ",
"TextSize": 9.0,
}
],
}""")
def id_generator(size=3, chars=string.ascii_uppercase + string.digits):
return ''.join(random.choice(chars) for _ in range(size))
text = reader.read()
random_value = id_generator()
pattern = r'"Path": "(.*?)"'
replacement = '"Path": "\\1/'+random_value+'"'
text = re.sub(pattern, replacement, text)
#This is working but it is only attaching one same random_value on every Path
print(text)
Use group 1 in the replacement:
replacement = '"Path": "\\1/1A"'
See live demo.
The replacement regex \1 puts back what was captured in group 1 of the match via (.*?).
Since you already have a json structure, maybe it would help to use the json module to parse it.
import json
myDict = json.loads("your json string / variable here")
# now myDict is a dictionary that you can use to loop/read/edit/modify and you can then export myDict as json.

Why can't a select values not in double quotes from a json file in python programming

import json
f = open(r'C:\Users\Arun\Documents\Input.json',)
data = json.load(f)
json.dumps(data, separators=(",", ":"))
for i in data["notificationChannels"]:
if i["enabled"] == "true":
print(i)
But I can get the values if I use if i["type"] == "BEAN": , Why can't I pull values which are not in double quotes
Sample Json File Content:
{
"notificationChannels": [
{
"type": "BEAN",
"enabled": true,
I believe that the "json" library was build to read the value of true (without double quotes) as boolean True, so that the comparison to string "true" is wrong in the first place.
My suggestion to try:
...
if i["enabled"] == True:
...

Getting values from Json data in Python

I have a json file that I am trying to pull specific attribute data from. The Json data is essentially a dictionary. Before the data is turned into a file, it is first held in a variable like this:
params = {'f': 'json', 'where': '1=1', 'geometryType': 'esriGeometryPolygon', 'spatialRel': 'esriSpatialRelIntersects','outFields': '*', 'returnGeometry': 'true'}
r = requests.get('https://hazards.fema.gov/gis/nfhl/rest/services/CSLF/Prelim_CSLF/MapServer/3/query', params)
cslfJson = r.json()
and then written into a file like this:
path = r"C:/Workspace/Sandbox/ScratchTests/cslf.json"
with open(path, 'w') as f:
json.dump(cslfJson, f, indent=2)
within this json data is an attribute called DFIRM_ID. I want to create an empty list called dfirm_id = [], get all of the values for DFIRM_ID and for that value, append it to the list like this dfirm_id.append(value). I am thinking I need to somehow read through the json variable data or the actual file, but I am not sure how to do it. Any suggestions on an easy method to accomplish this?
dfirm_id = []
for k, v in cslf:
if cslf[k] == 'DFIRM_ID':
dfirm.append(cslf[v])
As requested, here is what print(cslfJson) looks like:
It actually prints a huge dictionary that looks like this:
{'displayFieldName': 'CSLF_ID', 'fieldAliases': {'OBJECTID':
'OBJECTID', 'CSLF_ID': 'CSLF_ID', 'Area_SF': 'Area_SF', 'Pre_Zone':
'Pre_Zone', 'Pre_ZoneST': 'Pre_ZoneST', 'PRE_SRCCIT': 'PRE_SRCCIT',
'NEW_ZONE': 'NEW_ZONE', 'NEW_ZONEST': .... {'attributes': {'OBJECTID':
26, 'CSLF_ID': '13245C_26', 'Area_SF': 5.855231804165408e-05,
'Pre_Zone': 'X', 'Pre_ZoneST': '0.2 PCT ANNUAL CHANCE FLOOD HAZARD',
'PRE_SRCCIT': '13245C_STUDY1', 'NEW_ZONE': 'A', 'NEW_ZONEST': None,
'NEW_SRCCIT': '13245C_STUDY2', 'CHHACHG': 'None (Zero)', 'SFHACHG':
'Increase', 'FLDWYCHG': 'None (Zero)', 'NONSFHACHG': 'Decrease',
'STRUCTURES': None, 'POPULATION': None, 'HUC8_CODE': None, 'CASE_NO':
None, 'VERSION_ID': '2.3.3.3', 'SOURCE_CIT': '13245C_STUDY2', 'CID':
'13245C', 'Pre_BFE': -9999, 'Pre_BFE_LEN_UNIT': None, 'New_BFE':
-9999, 'New_BFE_LEN_UNIT': None, 'BFECHG': 'False', 'ZONECHG': 'True', 'ZONESTCHG': 'True', 'DFIRM_ID': '13245C', 'SHAPE_Length':
0.009178426056888393, 'SHAPE_Area': 4.711699932249018e-07, 'UID': 'f0125a91-2331-4318-9a50-d77d042a48c3'}}, {'attributes': .....}
If your json data is already a dictionary, then take advantage of that. The beauty of a dictionary / hashmap is that it provides an average time complexity of O(1).
Based on your comment, I believe this will solve your problem:
dfirm_id = []
for feature in cslf['features']:
dfirm_id.append(feature['attributes']['DFIRM_ID'])

Emit Python embedded object as native JSON in YAML document

I'm importing webservice tests from Excel and serialising them as YAML.
But taking advantage of YAML being a superset of JSON I'd like the request part of the test to be valid JSON, i.e. to have delimeters, quotes and commas.
This will allow us to cut and paste requests between the automated test suite and manual test tools (e.g. Postman.)
So here's how I'd like a test to look (simplified):
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request:
{
"unitTypeCode": "",
"unitNumber": "15",
"levelTypeCode": "L",
"roadNumber1": "810",
"roadName": "HAY",
"roadTypeCode": "ST",
"localityName": "PERTH",
"postcode": "6000",
"stateTerritoryCode": "WA"
}
In Python, my request object has a dict attribute called fields which is the part of the object to be serialised as JSON. This is what I tried:
import yaml
def request_presenter(dumper, request):
json_string = json.dumps(request.fields, indent=8)
return dumper.represent_str(json_string)
yaml.add_representer(Request, request_presenter)
test = Test(...including embedded request object)
serialised_test = yaml.dump(test)
I'm getting:
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request: "{
\"unitTypeCode\": \"\",\n
\"unitNumber\": \"15\",\n
\"levelTypeCode": \"L\",\n
\"roadNumber1\": \"810\",\n
\"roadName\": \"HAY\",\n
\"roadTypeCode\": \"ST\",\n
\"localityName\": \"PERTH\",\n
\"postcode\": \"6000\",\n
\"stateTerritoryCode\": \"WA\"\n
}"
...only worse because it's all on one line and has white space all over the place.
I tried using the | style for literal multi-line strings which helps with the line breaks and escaped quotes (it's more involved but this answer was helpful.) However, escaped or multiline, the result is still a string that will need to be parsed separately.
How can I stop PyYaml analysing the JSON block as a string and make it just accept a block of text as part of the emitted YAML? I'm guessing it's something to do with overriding the emitter but I could use some help. If possible I'd like to avoid post-processing the serialised test to achieve this.
Ok, so this was the solution I came up with. Generate the YAML with a placemarker ahead of time. The placemarker marks the place where the JSON should be inserted, and also defines the root-level indentation of the JSON block.
import os
import itertools
import json
def insert_json_in_yaml(pre_insert_yaml, key, obj_to_serialise):
marker = '%s: null' % key
marker_line = line_of_first_occurrence(pre_insert_yaml, marker)
marker_indent = string_indent(marker_line)
serialised = json.dumps(obj_to_serialise, indent=marker_indent + 4)
key_with_json = '%s: %s' % (key, serialised)
serialised_with_json = pre_insert_yaml.replace(marker, key_with_json)
return serialised_with_json
def line_of_first_occurrence(basestring, substring):
"""
return line number of first occurrence of substring
"""
lineno = lineno_of_first_occurrence(basestring, substring)
return basestring.split(os.linesep)[lineno]
def string_indent(s):
"""
return indentation of a string (no of spaces before a nonspace)
"""
spaces = ''.join(itertools.takewhile(lambda c: c == ' ', s))
return len(spaces)
def lineno_of_first_occurrence(basestring, substring):
"""
return line number of first occurrence of substring
"""
return basestring[:basestring.index(substring)].count(os.linesep)
embedded_object = {
"unitTypeCode": "",
"unitNumber": "15",
"levelTypeCode": "L",
"roadNumber1": "810",
"roadName": "HAY",
"roadTypeCode": "ST",
"localityName": "PERTH",
"postcode": "6000",
"stateTerritoryCode": "WA"
}
yaml_string = """
---
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request: null
after_request: another value
"""
>>> print(insert_json_in_yaml(yaml_string, 'request', embedded_object))
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request: {
"unitTypeCode": "",
"unitNumber": "15",
"levelTypeCode": "L",
"roadNumber1": "810",
"roadName": "HAY",
"roadTypeCode": "ST",
"localityName": "PERTH",
"postcode": "6000",
"stateTerritoryCode": "WA"
}
after_request: another value

json.loads or json.load() on tweet result

I use this code to get the tweet from the feed which i write inside a file.
When i read the file and try to json the lines i always get an ERROR.
def SearchTwt(api):
os.chdir('/Users/me/Desktop')
SearchResult = api.search( q='market',lang='en',rpp=20)
text_file = open("TweetOut.txt", "w")
for tw in SearchResult:
text_file.write(str(tw))
print(str(tw))
text_file.close()
I read the file with:
def readfile():
tweets_data = []
os.chdir('/Users/me/Desktop')
file = open("TweetOut.txt", "r")
for line in file:
parts = line.split("Status(")
print (len(parts))
for part in parts:
tweet = 'Status('+part
if len(tweet) > 10:
tweetj = json.loads(tweet)
#tweets_data.append(tweet)
print(tweet)
file.close()
May be this is wrong to fill the file with str(tw)? Yes I rebuild the string during the reading because i thought the tweet started like that. So may be another mistake.
I tried a lot of other options.
the error:
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
the file starts like this (edited the url as asked by stack):
Status(source='SocialFlow', id=757991135465857024, in_reply_to_status_id=None, is_quote_status=False, entities={'hashtags': [], 'user_mentions': [], 'symbols': [], 'urls': [{'url': '', 'expanded_url': '', 'display_url':
The file is not valid JSON. It should be something like
{
"source": "SocialFlow",
"id":"757991135465857024",
...
"entities": {
"hashtags": [],
"user_mentions": [],
...
}
}
Because it is not valid json you either have to parse it a different way, or be sure to write it as json when you save the file.