So I already wrote this code that works, but it feels like there must be a better way to do it:
import json

# json_data is a passed-in JSON object
str_json = str(json_data)
# convert the string to a Python dictionary
new_dict = json.loads(str_json)
The json module is your friend: json.loads is the method you should use to parse a JSON string into a Python dictionary. One caveat: str(json_data) only yields valid JSON if json_data is already a JSON-formatted string (making the str() call redundant); calling str() on a Python dict produces single-quoted text that json.loads will reject.
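A minimal sketch, assuming json_data arrives as a JSON-formatted string (the sample payload here is made up):

import json

# json_data is assumed to be a JSON-formatted string
json_data = '{"name": "example", "values": [1, 2, 3]}'

# json.loads parses it straight into a Python dictionary;
# no intermediate str() conversion is needed
new_dict = json.loads(json_data)
print(new_dict["name"])   # example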
I have a JSON-encoded chart configuration in a text database field, in the form of:
{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]]}
The first number of these IDs is a primary key of another table. I'd like to remove those entries with a trigger when the referenced row is deleted. A plperl function would be good, except that it does not preserve the order of the hash keys, and the order is important in this project. What can I do (without changing the format of the JSON-encoded config)? Note: the chart name can contain any characters, so it's hard to do this with a regex.
You need to use a streaming JSON decoder, such as JSON::Streaming::Reader. You could then store your JSON as an array of key/value pairs instead of a hash; JSON arrays, unlike objects, are ordered, so the key order survives.
How you might actually do this is highly dependent on the structure of your data, but given the simple example provided, here's a basic implementation.
use strict;
use warnings;
use JSON::Streaming::Reader;
use JSON 'to_json';
my $s = '{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]]}';
my $jsonr = JSON::Streaming::Reader->for_string($s);
my @data;

# walk the token stream; each start_property token carries a key name
while (my $token = $jsonr->get_token) {
    my ($key, $value) = @$token;
    if ($key eq 'start_property') {
        # slurp this property's entire value, preserving encounter order
        push @data, { $value => $jsonr->slurp };
    }
}

print to_json(\@data);
The output of this script is always:
[{"Name":[[1,1],[1,2],[2,1]]},{"Name2":[[3,2]]}]
Well, I managed to solve my problem, but it's not a general solution, so it will probably not help the casual reader. Anyway, I recovered the key order with the help of the database, calling my function like this:
SELECT remove_from_chart(
chart_config,
array(select * from json_object_keys(chart_config::json)),
id);
Then I walked through the keys in the order given by the second parameter, put the results into a new tied (Tie::IxHash) hash, and JSON-encoded it. This works because json_object_keys on the json type returns the keys in the order they appear in the stored text.
It's pretty sad that there is no Perl JSON decoder that preserves key order, when everything else I work with on this project (PHP, Postgres, Firefox, Chrome) does.
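For what it's worth, Python's decoder also keeps key order (json.loads inserts keys in document order, and dicts preserve insertion order in Python 3.7+), which is easy to verify:

import json

s = '{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]]}'

# keys come back in the order they appear in the JSON text
print(list(json.loads(s).keys()))   # ['Name', 'Name2']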
JSON objects are unordered. You will have to encode the desired order into your data somehow, for example with an explicit order key:
{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]], "__order__":["Name","Name2"]}
[{"Name":[[1,1],[1,2],[2,1]]},{"Name2":[[3,2]]}]
Maybe you want a streaming decoder of JSON data, like a SAX parser. If so, see JSON::Streaming::Reader or JSON::SL.
I have this complex JSON format. I don't know whether it is valid JSON or not, because the key I am looking for is not in the first two dictionaries but is there in the third. Is this valid JSON? If so, how do I parse it in this scenario?
As I am not allowed to ask a new question, I edited my old one.
If you put a return statement at the end of the first two if blocks, it will stop the program from going on to execute the Alamofire block.
I have a sequence file containing multiple JSON records. I want to send each JSON record to a function. How can I extract one record at a time?
Unfortunately there is no standard way to do this.
Unlike YAML, which has a well-defined way to let one file contain multiple YAML "documents", JSON has no such standard.
One way to solve your problem is to invent your own object separator. For example, you can use newline characters to separate adjacent JSON objects: tell your JSON encoder not to output any literal newline characters (a newline inside a string value is escaped into the two characters \ and n). As long as your JSON decoder is sure it will never see a newline except as a separator between two JSON objects, it can read the stream one line at a time and decode each line.
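A minimal sketch of this newline-delimited approach in Python (the records here are made up; json.dumps already escapes any newline inside a string value, so each record stays on one line):

import json

records = [{"id": 1, "note": "first\nrecord"}, {"id": 2, "note": "second"}]

# encode: one JSON object per line; dumps() escapes the embedded newline as \n
with open("records.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# decode: each line is a complete, self-contained JSON object
with open("records.jsonl") as f:
    for line in f:
        obj = json.loads(line)
        print(obj["id"], obj["note"])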
It has also been suggested that you can use JSON arrays to store multiple JSON objects, but it will no longer be a "stream".
You can read the content of your sequence file into an RDD[String] and convert it to a Spark DataFrame:

import org.apache.hadoop.io.{BytesWritable, LongWritable}

// decode each record's raw bytes into a JSON string
// (copyBytes avoids the padding that getBytes can leave at the end)
val seqFileContent = sc
  .sequenceFile[LongWritable, BytesWritable](inputFilename)
  .map(x => new String(x._2.copyBytes()))

// let Spark infer a schema from the JSON strings
val dataframeFromJson = sqlContext.read.json(seqFileContent)
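If you'd rather hand each record to a plain function, here is a rough PySpark sketch of the same idea (input_filename and handle are hypothetical placeholders; PySpark delivers BytesWritable values as bytearrays):

import json
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
input_filename = "path/to/sequence_file"   # placeholder

# decode each record's bytes into a JSON string
json_strings = sc.sequenceFile(input_filename).map(
    lambda kv: bytes(kv[1]).decode("utf-8"))

def handle(record):
    # hypothetical per-record processing
    return record.get("id")

# parse each line and pass every record to the function, one at a time
results = json_strings.map(json.loads).map(handle).collect()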
I want to convert my nested JSON into CSV. I used:
df.write.format("com.databricks.spark.csv").option("header", "true").save("mydata.csv")
It works for flat JSON but not for nested JSON. Is there any way I can convert my nested JSON to CSV? Help will be appreciated, thanks!
When you ask Spark to convert a JSON structure to a CSV, Spark can only map the first level of the JSON.
This happens because of the simplicity of the CSV format: it just assigns a value to a name. That is why {"name1":"value1", "name2":"value2", ...} can be represented as a CSV with this structure:
name1,name2, ...
value1,value2,...
In your case, you are converting a JSON with several levels, so the Spark exception is saying that it cannot figure out how to convert such a complex structure into a CSV.
If your JSON has only a second level, it will work, but be careful: Spark will drop the second-level names and include only the values in an array.
You can have a look at the Spark documentation on JSON datasets, which includes a worked example.
As I have no information about the nature of the data, I can't say much more about it. But if you need to write the information as a CSV, you will need to flatten the structure of your data first, along the lines of the sketch below.
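A minimal PySpark sketch of that flattening step, assuming a hypothetical two-level record with an address object nested in each row; the nested fields are pulled up into top-level columns before writing the CSV:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("flatten-json-sketch").getOrCreate()

# hypothetical nested records; in practice this comes from reading your JSON
df = spark.read.json(spark.sparkContext.parallelize([
    '{"name": "a", "address": {"city": "x", "zip": "1"}}',
    '{"name": "b", "address": {"city": "y", "zip": "2"}}',
]))

# pull the nested fields up into flat, scalar columns
flat = df.select(
    col("name"),
    col("address.city").alias("city"),
    col("address.zip").alias("zip"),
)

# every column is now a simple value, so CSV output works
flat.write.option("header", "true").csv("mydata_flat.csv")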
Read the JSON file in Spark and create a DataFrame:
val path = "examples/src/main/resources/people.json"
val people = sqlContext.read.json(path)
Save the DataFrame using spark-csv:
people.write
.format("com.databricks.spark.csv")
.option("header", "true")
.save("newcars.csv")
Source: the Spark documentation on reading JSON and the spark-csv documentation on saving to CSV.