How to save JSON data into CSV using Python - json

I have JSON data like this [image 1].
I want to save the JSON into a CSV file.
The output should look like this: each title becomes a column holding the information that belongs to that title.

I hope this gets converted to a comment, but look at Pandas; it can probably do what you want (search for "Pandas JSON to CSV").
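If Pandas is not an option, the standard library can probably handle simple cases too. Here is a minimal sketch, assuming the JSON is an array of flat objects (the real data is only in the image, so the sample records and the function name json_records_to_csv are made up for illustration). Each key becomes a column title, and each row holds the values under those titles.

```python
import csv
import io
import json

# Hypothetical sample; the real data from the question is not shown.
json_text = '[{"title": "A", "price": 10}, {"title": "B", "price": 12, "stock": 3}]'


def json_records_to_csv(text):
    """Convert a JSON array of flat objects to CSV text, one column per key."""
    records = json.loads(text)
    # Collect column names in first-seen order so rows with extra keys still fit.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # missing keys are filled with restval
    return buffer.getvalue()


csv_text = json_records_to_csv(json_text)
print(csv_text)
```

Nested objects would need flattening first, which this sketch does not attempt.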

Related

Creating individual JSON files from a CSV file that is already in JSON format

I have JSON data in a CSV file that I need to break apart into separate JSON files. The data looks like this: {"EventMode":"","CalculateTax":"Y",.... There are multiple rows of this and I want each row to be a separate JSON file. I have used code provided by Jatin Grover that parses the CSV into JSON:
lcount = 0
out = json.dumps(row)
jsonoutput = open( 'json_file_path/parsedJSONfile'+str(lcount)+'.json', 'w')
jsonoutput.write(out)
lcount+=1
This does an excellent job; the problem is that it adds "R": " before the {"EventMode... and adds an extra \ between each element, as well as an item at the end.
Each row of the CSV file is already a valid JSON object. I just need to break each row into a separate file with the .json extension.
I hope that makes sense. I am very new to all of this.
It's not clear from your picture what your CSV actually looks like.
I mocked up a really small CSV with JSON lines that looks like this:
Request
"{""id"":""1"", ""name"":""alice""}"
"{""id"":""2"", ""name"":""bob""}"
(all the double-quotes are for escaping the quotes that are part of the JSON)
When I run this little script:
import csv

with open('input.csv', newline='') as input_file:
    reader = csv.reader(input_file)
    next(reader)  # discard/skip the first line ("header")
    for i, row in enumerate(reader):
        # row[0] is the single cell holding the JSON text; write it unchanged
        with open(f'json_file_path/parsedJSONfile{i}.json', 'w') as output_file:
            output_file.write(row[0])
I get two files, json_file_path/parsedJSONfile0.json and json_file_path/parsedJSONfile1.json, that look like this:
{"id":"1", "name":"alice"}
and
{"id":"2", "name":"bob"}
Note that I'm not using json.dumps(...). That only makes sense if you are starting with data inside Python and want to save it as JSON. Your file already has text that is complete JSON, so you basically copy each line as-is into a new file.
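To see where the extra "R": " and the backslashes probably come from, here is a small sketch. It assumes the original code read each row as a dict keyed by a column header "R" (the question does not show that part, so this is a guess), and it uses the truncated sample from the question closed off by hand.

```python
import json

# Hypothetical row, as a dict-based CSV reader might return it: the column
# header "R" maps to a cell that already holds JSON text.
raw = '{"EventMode":"","CalculateTax":"Y"}'
row = {"R": raw}

# Dumping the dict wraps the JSON text in a string and escapes its quotes.
wrapped = json.dumps(row)
print(wrapped)

# The cell itself is already JSON; json.loads here only checks that it parses.
parsed = json.loads(raw)
print(parsed)
```

Writing raw straight to the file, as in the script above, avoids the double encoding.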

Apache Nifi : How to create parquet file from CSV file with schema saved in "avro.schema" attribute

I am trying to create a parquet file from a CSV file using Apache Nifi.
I am able to convert the CSV to a parquet file, but the problem is that the schema of the parquet file contains a struct type, which I need to convert to string type.
I am using Apache Nifi 1.14.0 on Windows Server 2016.
This is what I've tried to convert CSV to parquet till now...
I have used the below 3 controllers
CSVReader
CSVRecordSetWriter
ParquetRecordSetWriter
And these are the processors in the flow:
GetFile
ConvertRecord (CSVReader to CSVRecordSetWriter; this automatically generates the "avro.schema" attribute, which I update in the next step)
UpdateAttribute (updating the "avro.schema" attribute: wherever 2 data types were inferred, I replace them with '["null","string"]')
ConvertRecord (CSVReader to ParquetRecordSetWriter)
UpdateAttribute (for appending '.parquet' to the filename)
PutFile
I also want to know how to view a .parquet file on Windows. Currently, I am reading the parquet file via PySpark and checking the schema. :|
This is what the parquet file schema looks like after conversion. I want string instead of struct as output.
Please note: there are lots of CSVs with many columns/fields, so I don't want to create the schema manually.
Alternatively, any other way to achieve this would be very helpful.
Thanks!
After playing around with some more options of "ParquetRecordSetWriter", I was able to create a parquet file with the schema that I've captured in "avro.schema" attribute.

Writing spark dataframe to ascii JSON

I am attempting to write a spark dataframe as JSON file; this will eventually be written out into MapR JSON DB table.
grp_small.toJSON.write.save("<path>")
This seems to write the JSON file in snappy.parquet format. How do I force it to write readable JSON (text format)?
You can write the dataframe as JSON, with each row as a readable JSON object on its own line:
grp_small.write.json("path to output")
Hope this helps!

DataFrame write to CSV not supporting some characters

I am trying to parse an XML file and write the resulting DataFrame to a CSV file.
My problem is that some characters are not written correctly to the CSV. For example, the field value Nectarine tree named ‘Polar Zee’ is written as Nectarine tree named ‘Polar Zee’.
Are there any settings that need to be changed, or any properties that need to be added?
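One possible cause, which the question does not confirm, is an encoding mismatch: the curly quotes are written as UTF-8 but the file is later read as cp1252. The sketch below uses the standard csv module rather than a DataFrame, and the file name out.csv is made up. It reproduces that kind of garbling and then writes the file as utf-8-sig (UTF-8 with a byte order mark), which some spreadsheet programs use to detect the encoding.

```python
import csv
import os
import tempfile

text = "Nectarine tree named \u2018Polar Zee\u2019"

# Decoding UTF-8 bytes as cp1252 garbles the curly quotes.
garbled = text.encode("utf-8").decode("cp1252")
print(garbled)

# Write with an explicit encoding instead of relying on the platform default.
path = os.path.join(tempfile.mkdtemp(), "out.csv")
with open(path, "w", newline="", encoding="utf-8-sig") as f:
    csv.writer(f).writerow(["name", text])

# Reading back with the same encoding returns the original text.
with open(path, newline="", encoding="utf-8-sig") as f:
    round_trip = next(csv.reader(f))[1]
print(round_trip == text)
```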

How do I read a list of JSON files from file in python?

I have a list of JSON files saved to disk that I would like to read. Sometimes a JSON file spans more than one line, so I think that a simple list comprehension that loops over open(file,'rb').readlines() will fail.
The files are surrounded in brackets and so passing them to json.load or json.loads won't work.
An example file would be:
[{key:value,key2:value2},{morekeys:morevalues},{evenmorekeys,evenmorevalues}]
What is the best (most Pythonic) way to read a saved list of JSON entries when the entries span more than one line?
Your example is valid JSON (assuming the keys and values are properly quoted in the real files). [] defines a JSON array. What you have is an array of objects:
import json

with open("myFile.json") as f:
    objects = json.load(f)
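For a whole list of files, the same call can go in a loop. This is a minimal sketch; the helper name load_json_files and the two sample files are made up for the demo. Because json.load parses the whole file, it does not matter how many lines the JSON spans.

```python
import json
import os
import tempfile


def load_json_files(paths):
    """Parse each file as a whole, so line breaks inside the JSON do not matter."""
    results = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            results.append(json.load(f))
    return results


# Hypothetical demo files; each holds a JSON array spread over several lines.
tmp_dir = tempfile.mkdtemp()
samples = [
    '[{"key": "value"},\n {"key2": "value2"}]',
    '[\n  {"morekeys": "morevalues"}\n]',
]
paths = []
for i, content in enumerate(samples):
    path = os.path.join(tmp_dir, f"file{i}.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    paths.append(path)

loaded = load_json_files(paths)
print(loaded)
```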