I have seen the terms "deserialize" and "serialize" with JSON. What do they mean?
JSON is a format that encodes objects in a string. Serialization means to convert an object into that string, and deserialization is its inverse operation (convert string -> object).
When transmitting data or storing them in a file, the data are required to be byte strings, but complex objects are seldom in this format. Serialization can convert these complex objects into byte strings for such use. After the byte strings are transmitted, the receiver will have to recover the original object from the byte string. This is known as deserialization.
Say, you have an object:
{foo: [1, 4, 7, 10], bar: "baz"}
serializing into JSON will convert it into a string:
'{"foo":[1,4,7,10],"bar":"baz"}'
which can be stored or sent over the wire to anywhere. The receiver can then deserialize this string to get back the original object, {foo: [1, 4, 7, 10], bar: "baz"}.
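In Python, for instance, the round trip might look like the sketch below. It uses the standard json module; the variable names are only illustrative, and the compact separators are chosen just to reproduce the string shown above.

```python
import json

# The object from the example above, written as a Python dict.
obj = {"foo": [1, 4, 7, 10], "bar": "baz"}

# Serialize: object -> JSON string (compact separators, no extra spaces).
text = json.dumps(obj, separators=(",", ":"))
print(text)  # {"foo":[1,4,7,10],"bar":"baz"}

# Deserialize: JSON string -> object.
restored = json.loads(text)
print(restored == obj)  # True
```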
Serialize and Deserialize
In the context of data storage, serialization (or serialisation) is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer) or transmitted (for example, across a network connection link) and reconstructed later. [...]
The opposite operation, extracting a data structure from a series of bytes, is deserialization.
– wikipedia.org
JSON
JSON (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
JSON is a language-independent data format. It was derived from JavaScript, but many modern programming languages include code to generate and parse JSON-format data. JSON filenames use the extension .json.
– wikipedia.org
Explained using Python
In Python, serialization simply converts the given data structure into its valid JSON counterpart (e.g., Python's True is converted to JSON's true, and the dictionary itself is converted to a string); deserialization does the reverse.
Python vs. JSON
You can easily spot the difference between Python and JSON representations in a side-by-side comparison, for example by examining their Boolean values. Have a look at the following table for the basic types used in both contexts:
Python                                               | JSON
True                                                 | true
False                                                | false
None                                                 | null
int, float                                           | number
str (with single ', double " and triple """ quotes)  | string (only double " quotes)
dict                                                 | object
list, tuple                                          | array
Code Example
Python's built-in json module is the standard way to do serialization and deserialization:
import json
data = {
    'president': {
        "name": """Mr. Presidente""",
        "male": True,
        'age': 60,
        'wife': None,
        'cars': ('BMW', "Audi")
    }
}
# serialize
json_data = json.dumps(data, indent=2)
print(json_data)
# {
#   "president": {
#     "name": "Mr. Presidente",
#     "male": true,
#     "age": 60,
#     "wife": null,
#     "cars": [
#       "BMW",
#       "Audi"
#     ]
#   }
# }
# deserialize
restored_data = json.loads(json_data)
Sources: realpython.com, geeksforgeeks.org
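One consequence of the type mapping is that a round trip is not always lossless. The following sketch (with illustrative data, not taken from the sources above) shows two cases: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

original = {"cars": ("BMW", "Audi"), "wife": None, "male": True}
restored = json.loads(json.dumps(original))

# JSON has no tuple type, so the tuple is restored as a list.
print(restored["cars"])      # ['BMW', 'Audi']
print(restored == original)  # False

# JSON object keys are always strings, so the int key 1 becomes "1".
keys_restored = json.loads(json.dumps({1: "a"}))
print(keys_restored)         # {'1': 'a'}
```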
Explanation of Serialize and Deserialize using Python
In Python, the pickle module can also be used for serialization; with pickle, the serialization process is called pickling. The module is part of the Python standard library.
Serialization using pickle
import pickle
# the object to serialize
example_dic = {1: "6", 2: "2", 3: "f"}
# the file where the serialized bytes end up; "wb" means write in binary mode
pickle_out = open("dict.pickle", "wb")
# time to dump
pickle.dump(example_dic, pickle_out)
# whatever you open, you must close
pickle_out.close()
The PICKLE file (can be opened by a text editor like notepad) contains this (serialized data):
€}q (KX 6qKX 2qKX fqu.
Deserialization using pickle
import pickle
# "rb" means read in binary mode
pickle_in = open("dict.pickle", "rb")
get_deserialized_data_back = pickle.load(pickle_in)
pickle_in.close()
print(get_deserialized_data_back)
Output:
{1: '6', 2: '2', 3: 'f'}
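If no file is needed, the same round trip can be done in memory. This is a minimal sketch using pickle.dumps and pickle.loads from the standard library; the variable names blob and restored are only illustrative. Note that pickle data should only be loaded from sources you trust, since unpickling can execute arbitrary code.

```python
import pickle

example_dic = {1: "6", 2: "2", 3: "f"}

# Serialize to a bytes object instead of writing to a file.
blob = pickle.dumps(example_dic)
print(type(blob).__name__)  # bytes

# Deserialize the bytes back into an equal dictionary.
restored = pickle.loads(blob)
print(restored)  # {1: '6', 2: '2', 3: 'f'}

# Unlike a JSON round trip, the integer keys are preserved.
print(all(isinstance(key, int) for key in restored))  # True
```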
Sharing what I learned about this topic.
What is Serialization
Serialization is the process of converting a data object into a byte stream.
What is byte stream
A byte stream is just a stream of binary data. It matters because only binary data can be stored or transported.
What is byte string vs byte stream
Sometimes you will see people use the term byte string as well. String encodings of bytes are called byte strings. This helps explain what JSON is, as described below.
What’s the relationship between JSON and serialization
JSON is a string format that represents data. JSON text is typically encoded in UTF-8, so while we see human-readable strings, behind the scenes the strings are encoded as bytes in UTF-8.
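The two layers (text, then bytes) can be made visible in Python. This is only a sketch with made-up data; ensure_ascii=False is used so that the euro sign stays a single character in the text and shows up as several bytes after encoding.

```python
import json

obj = {"bar": "baz", "price": "5 €"}

# Step 1: object -> JSON text (a str of characters).
text = json.dumps(obj, ensure_ascii=False)

# Step 2: JSON text -> bytes, using UTF-8.
raw = text.encode("utf-8")

print(type(text).__name__, type(raw).__name__)  # str bytes
# The euro sign is one character but three bytes in UTF-8.
print(len(raw) - len(text))  # 2

# Going back: bytes -> text -> object.
restored = json.loads(raw.decode("utf-8"))
print(restored == obj)  # True
```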
Related
How do I remove backslashes from my JSON dump?
My Python code is:
@sio.on('donation')
def on_message(data):
    y = json.loads(data)
    with open('donate.json', 'w') as outfile:
        json.dump(data, outfile)
If I print it, everything is fine and there are no backslashes! But if I open my JSON file, it looks like this:
"{\"id\":107864345,\"alert_type\":\"1\",\"is_shown\":\"0\",\"additional_data\":\"{\\\"randomness\\\":811}\",\"billing_system\":\"fake\",\"billing_system_type\":null,\"username\":\"test24\",\"amount\":\"1.00\",\"amount_formatted\":\"1\",\"amount_main\":1,\"currency\":\"USD\",\"message\":\"aaaaaa aaaa\",\"header\":\"\",\"date_created\":\"2022-12-17 21:57:10\",\"emotes\":null,\"ap_id\":null,\"_is_test_alert\":true,\"message_type\":\"text\",\"preset_id\":0}"
I have tried everything I know.
Your code, annotated:
def on_message(data):
When this function is called, it is provided with the argument data, which is a string containing the JSON encoding for a complex object.
y = json.loads(data)
Now data is still the same string, and y is the complex object which was represented by data.
with open('donate.json', 'w') as outfile:
json.dump(data, outfile)
json.dump takes a data object and turns it into a string. With two arguments, as here, it writes the string to a file. But despite its name, data is not the data object. It's a string. The data object is y.
json.dump will convert any Python object with a JSON representation to a string representing that object, and a string can be represented in JSON. So in this case, the string in data is encoded as a JSON representation. That means that the string must be enclosed in double quotes and any special characters escaped.
But that's not what you wanted. You wanted to dump the data object, which you have named y. Changing that line to
json.dump(y, outfile)
will probably do what you want.
But if you just wanted to write out the string, there wasn't much point converting it to JSON and back to a string. You could just write it out:
outfile.write(data)
Then you can get rid of y (unless you need it somewhere else).
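The difference is easy to reproduce in isolation. In this sketch, the short JSON text is a made-up stand-in for the real message; dumping the string produces the quoted, backslash-escaped form from the question, while dumping the parsed object produces plain JSON.

```python
import json

# A JSON text, as it would arrive in the handler (illustrative content).
data = '{"id": 1, "message": "hi"}'

# The parsed object.
y = json.loads(data)

# Dumping the string encodes it as a JSON string: quoted and escaped.
print(json.dumps(data))  # "{\"id\": 1, \"message\": \"hi\"}"

# Dumping the parsed object gives the JSON you expected.
print(json.dumps(y))     # {"id": 1, "message": "hi"}
```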
Could someone please explain the difference between as_json and JSON.parse?
Is the only difference that JSON.parse is part of the Ruby Standard Library and as_json is part of Rails ActiveSupport?
Rails console:
irb(main):001:0> json_str = "{\"foo\": {\"bar\": 1, \"baz\": 2}, \"bat\": [0, 1, 2]}"
irb(main):002:0> puts JSON.parse(json_str)
{"foo"=>{"bar"=>1, "baz"=>2}, "bat"=>[0, 1, 2]}
irb(main):003:0> puts json_str.as_json
{"foo": {"bar": 1, "baz": 2}, "bat": [0, 1, 2]}
Ref:
https://ruby-doc.org/stdlib-3.1.2/libdoc/json/rdoc/JSON/Ext/Parser.html#method-i-parse
https://api.rubyonrails.org/classes/ActiveModel/Serializers/JSON.html#method-i-as_json
The two methods are in some sense exact opposites of each other:
JSON.parse parses a Ruby String containing a JSON Document into a Ruby object corresponding to the JSON Value described by the JSON document. So, it goes from JSON to Ruby.
as_json returns a simplified representation of a complex Ruby object that uses only Ruby types that can be easily represented as JSON Values. In other words, it turns an arbitrarily complex Ruby object into a Ruby object that uses only Hash, Array, String, Integer, Float, true, false, and nil which correspond to the JSON types object (really a dictionary), array, string, number, boolean, and null. The intent is that you can then easily serialize this simplified representation to JSON. So, as_json goes from Ruby (halfway) to JSON. In other words, the opposite direction from JSON.parse.
Apart from operating in opposite directions, there are some minor other differences as well:
JSON.parse is a concrete method, whereas as_json is an abstract protocol that is implemented by many different kinds of objects. (Similar to e.g. each in Ruby).
JSON.parse is part of the Ruby standard library (but not the core library; more precisely, it is part of the json gem, which is a default gem). The as_json protocol is defined by Rails (ActiveSupport for core types, ActiveModel's Serializers API for models), i.e. it is part of Rails, not Ruby.
So, why does as_json exist in the first place? Why this two-step process of converting complex Ruby objects to simpler Ruby objects and then to a JSON Document instead of going straight from complex Ruby objects to a JSON Document? Well, if you have complex Ruby objects, chances are that no object actually fully knows how to serialize itself as a JSON Document. It has to first ask its constituent objects to serialize themselves, and then stitch it all together, and this applies recursively to the constituent objects as well. With all this stitching together of JSON Documents, there is a real risk of producing an invalid JSON Document or double-encoding some part of it, or something along those lines.
Basically, once you have serialized something to a JSON Document, then all you have is a String and all you can do is String manipulation. Whereas, if you have a richer Ruby object like Hash, Array, Integer, etc., then you can use that object's methods as well. Imagine, for example, having to merge two JSON Documents containing JSON Objects as a String compared to simply merging two Ruby Hashes.
So, the idea is to use as_json first to create a Ruby object that is simpler and less powerful than the original, but still much more powerful than a simple String. And only once you have assembled the entire thing, do you use to_json to serialize it to a JSON Document. (Or rather, the serialization framework does that for you.)
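The same two-step idea can be sketched outside Ruby. The following is a rough Python analogue, not Rails code: the classes and the method name as_jsonable are hypothetical, chosen only to mirror the role of as_json, with json.dumps playing the role of to_json.

```python
import json

class Address:
    def __init__(self, city):
        self.city = city

    def as_jsonable(self):
        # Step 1 for a leaf object: reduce itself to plain types.
        return {"city": self.city}

class User:
    def __init__(self, name, address):
        self.name = name
        self.address = address

    def as_jsonable(self):
        # Step 1 for a composite: ask constituents for their plain form.
        return {"name": self.name, "address": self.address.as_jsonable()}

user = User("Alice", Address("Paris"))

simple = user.as_jsonable()
# Still a dict, so it can be merged or extended with ordinary dict methods.
simple.update({"role": "Manager"})

# Step 2: serialize once, at the very end.
document = json.dumps(simple)
print(document)
# {"name": "Alice", "address": {"city": "Paris"}, "role": "Manager"}
```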
JSON.parse() parses the given JSON string and converts it to a Ruby object, while as_json returns a hash representing the model.
user = User.first
user.as_json
=> {"id"=>1, "email"=>"fa18-bcs-215#cuilahore.edu.pk",
"name"=>"Noman", "user_type"=>"Manager"}
And if we apply as_json to a string, it simply returns that string:
json_str = "{\"foo\": {\"bar\": 1, \"baz\": 2}, \"bat\": [0, 1, 2]}"
json_str.as_json
=> "{\"foo\": {\"bar\": 1, \"baz\": 2}, \"bat\": [0, 1, 2]}"
If we apply JSON.parse() to the string, it returns a hash:
JSON.parse(json_str)
=> {"foo"=>{"bar"=>1, "baz"=>2}, "bat"=>[0, 1, 2]}
JSON.parse parses a json string.
.as_json is a serialization method available to all data types, not just strings.
E.g. from the Rails docs:
user = User.find(1)
user.as_json
# => { "id" => 1, "name" => "Konata Izumi", "age" => 16,
# "created_at" => "2006-08-01T17:27:133.000Z", "awesome" => true}
But JSON.parse won't handle that:
user = User.find(1)
JSON.parse(user)
--> no implicit conversion of User into String (TypeError)
Edit
Edited based on comment from @Jorg.
JSON.parse does not return specifically a Hash.
I created a config file in JSON format, and I want to use KDB to read it in as a dictionary.
In Python, it's so easy:
with open('data.json') as f:
data = json.load(f)
Is there a similar function in KDB?
To read your JSON file into kdb+, you should use read0. This returns the lines of the file as a list of strings.
q)read0`:sample.json
,"{"
"\"name\":\"John\","
"\"age\":30,"
"\"cars\":[ \"Ford\", \"BMW\", \"Fiat\" ]"
,"}"
kdb+ allows for the de-serialisation (and serialisation) of JSON objects to dictionaries using the .j namespace. The inbuilt .j.k expects a single string of characters containing json and converts this into a dictionary. A raze should be used to flatten our list of strings:
q)raze read0`:sample.json
"{\"name\":\"John\",\"age\":30,\"cars\":[ \"Ford\", \"BMW\", \"Fiat\" ]}"
Finally, using .j.k on this string yields the dictionary
q).j.k raze read0`:sample.json
name| "John"
age | 30f
cars| ("Ford";"BMW";"Fiat")
For a particularly large JSON file, it may be more efficient to use read1 rather than raze read0 on your file, e.g.
q).j.k read1`:sample.json
name| "John"
age | 30f
cars| ("Ford";"BMW";"Fiat")
If you're interested in the reverse operation, you can use .j.j to convert a dictionary into a JSON string and use 0: to save it.
Further information on the .j namespace can be found here.
You can also see more examples on the Kx wiki of read0, read1 and 0:.
Working with JSON is handled by the .j namespace, where .j.j serialises and .j.k deserialises the messages. Note that you will need to use raze to convert the JSON into a single string first.
There is more information available on the Kx wiki, where the following example is presented:
q).j.k "{\"a\":[0,1],\"b\":[\"hello\",\"world\"]}"
a| 0 1
b| "hello" "world"
When using .j.j, both symbols and strings in kdb will be encoded as JSON strings, while .j.k will decode JSON strings to kdb strings, except for keys, which become symbols.
For JSON to be decoded as a kdb table, it should be sent as an array of objects with identical keys. kdb will also encode tables as arrays of objects in JSON.
q).j.k "[{\"a\":1,\"b\":2},{\"a\":3,\"b\":4}]"
a b
---
1 2
3 4
When encoding, q uses the value of \P to choose the precision. The default is 7, which could lead to unwanted rounding.
This can be changed; 0 means maximum precision, although the final digits are unreliable, as shown below. For more information, see https://code.kx.com/q/ref/cmdline/#-p-display-precision.
q).j.j 1.000001 1.0000001f
"[1.000001,1]"
q)\P 0
q).j.j 1.000001 1.0000001f
"[1.0000009999999999,1.0000001000000001]"
I have a JSON file containing a field that I need to edit, after which I need to save the file for later use.
The field I need to edit is shown below.
The value I need to assign to the field is generated randomly at run time. I'll capture it in a variable, pass it to the specific JSON key "dp", and then save the JSON.
The saved JSON will be used for a REST POST URL.
{
"p": "10",
"v": 100,
"vt": [
{
"dp": "Field to be edited"(integer value) ,
]
}
The simplest solution would be to write a python keyword that can change the value for you. However, you can solve this with robot keywords by performing the following steps:
convert the JSON string to a dictionary
modify the dictionary
convert the dictionary back to a JSON string
Convert the JSON string to a dictionary
Python has a module (json) for working with JSON data. You can use the evaluate keyword to convert your JSON string to a python dictionary using the loads (load string) method of that module.
Assuming your JSON data is in a robot variable named ${json_string}, you can convert it to a python dictionary like this:
${json}=    evaluate    json.loads('''${json_string}''')    json
With the above, ${json} now holds a reference to a dictionary that contains all of the json data.
Modify the dictionary
The Collections library that comes with robot has a keyword named set to dictionary which can be used to set the value of a dictionary element. In this case, you need to change the value of a dictionary nested inside the vt element of the JSON object. We can reference that nested dictionary using robot's extended variable syntax.
For example:
set to dictionary    ${json["vt"]}    dp=the new value
With that, ${json} now has the new value. However, it is still a python dictionary rather than JSON data, so there's one more step.
Convert the dictionary back to JSON
Converting the dictionary back to JSON is the reverse of the first step. Namely, use the dumps (dump string) method of the json module:
${json_string}=    evaluate    json.dumps(${json})    json
With that, ${json_string} will contain a valid JSON string with the modified data.
Complete example
The following is a complete working example. The JSON string will be printed before and after the substitution of the new value:
*** Settings ***
Library    Collections
*** Test Cases ***
Example
    ${json_string}=    catenate
    ...    {
    ...    "p": "10",
    ...    "v": 100,
    ...    "vt": {
    ...    "dp": "Field to be edited"
    ...    }
    ...    }
    log to console    \nOriginal JSON:\n${json_string}
    ${json}=    evaluate    json.loads('''${json_string}''')    json
    set to dictionary    ${json["vt"]}    dp=the new value
    ${json_string}=    evaluate    json.dumps(${json})    json
    log to console    \nNew JSON string:\n${json_string}
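For comparison, the Python-keyword route mentioned at the start of this answer could look roughly like the sketch below. The function name and its parameters are hypothetical; if the function lives in a Python file that is imported with a Library setting, Robot should expose it as a keyword, but that wiring is not shown here.

```python
import json

def set_nested_json_value(json_string, outer_key, inner_key, value):
    """Return json_string with data[outer_key][inner_key] set to value."""
    # 1. Convert the JSON string to a dictionary.
    data = json.loads(json_string)
    # 2. Modify the dictionary.
    data[outer_key][inner_key] = value
    # 3. Convert the dictionary back to a JSON string.
    return json.dumps(data)

json_string = '{"p": "10", "v": 100, "vt": {"dp": "Field to be edited"}}'
new_json_string = set_nested_json_value(json_string, "vt", "dp", 42)
print(new_json_string)  # {"p": "10", "v": 100, "vt": {"dp": 42}}
```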
For reading and writing data to and from a file, I use the OperatingSystem library:
${json}    Get Binary File    ${json_path}nameOfJsonFile.json
It works for me in API testing, to read a .json file and POST it, like this:
*** Settings ***
Library    Collections
Library    ExtendedRequestsLibrary
Library    OperatingSystem
*** Variables ***
${uri}    https://blabla.com/service/
${json_path}    C:/home/user/project/src/json/
*** Test Cases ***
Robot Test Case
    Create Session    alias    ${uri}
    &{headers}    Create Dictionary    Content-Type=application/json; charset=utf-8
    ${json}    Get Binary File    ${json_path}nameOfJsonFile.json
    ${resp}    Post Request    alias    data=${json}    headers=${headers}
    Should Be Equal As Strings    ${resp.status_code}    200
For integer values in JSON, the other answers did not work for me.
This worked:
${json}=    Catenate    { "p": "10", "v": 100, "vt": { "dp": "Field to be edited" } }
${value}    Set Variable    2    #the value you want
${value}    Convert To Integer    ${value}
${json}=    Evaluate    json.loads('''${json}''')    json
Set To Dictionary    ${json["vt"]}    dp=${value}
${json}=    Evaluate    json.dumps(${json})    json
Log    ${json}
Convert To Integer was required; otherwise the value would still be the string "${value}".
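The effect of the conversion is visible in the JSON that the json module produces. A minimal Python sketch, using the literal "2" as a stand-in for the Robot variable:

```python
import json

# Without conversion the value is a string, so JSON quotes it.
as_string = json.dumps({"dp": "2"})
print(as_string)   # {"dp": "2"}

# After converting to an integer, JSON emits a bare number.
as_integer = json.dumps({"dp": int("2")})
print(as_integer)  # {"dp": 2}
```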
What is the fastest way to convert this
{"a":"ab","b":"cd","c":"cd","d":"de","e":"ef","f":"fg"}
into a mutable map in Scala? I read this input string from a ~500MB file, which is why I'm concerned about speed.
If your JSON is as simple as in your example, i.e. a sequence of key/value pairs where each value is a string (and no key or value contains a comma or colon), you can do it in plain Scala:
myString.substring(1, myString.length - 1)
.split(",")
.map(_.split(":"))
.map { case Array(k, v) => (k.substring(1, k.length-1), v.substring(1, v.length-1))}
.toMap
That looks like a JSON file, as Andrey says. You should consider this answer. It gives some example Scala code. Also, this answer gives some different JSON libraries and their relative merits.
The fastest way to read tree data structures in XML or JSON is by using a streaming API: Jackson Streaming API To Read And Write JSON.
Streaming splits your input into tokens like 'beginning of an object' or 'beginning of an array', and you would need to build a parser for these tokens, which in some cases is not a trivial task.
Keeping it simple, if you are reading a JSON string from a file and converting it to a Scala map:
import scala.io.Source
import spray.json._
import DefaultJsonProtocol._
// jsonFilePath is assumed to hold the path to the JSON file
val jsonStr = Source.fromFile(jsonFilePath).mkString
val jsonDoc = jsonStr.parseJson
val map_doc = jsonDoc.convertTo[Map[String, JsValue]]
// Get a Map key value
val key_value = map_doc.get("key").get.convertTo[String]
// If nested json, re-map it.
val key_map = map_doc.get("nested_key").get.convertTo[Map[String, JsValue]]
println("Nested Value " + key_map.get("key").get)