mochijson2 or mochijson - json

I'm encoding some data using mochijson2.
But I found that it behaves strange on strings as lists.
Example:
mochijson2:encode("foo").
[91,"102",44,"111",44,"111",93]
Where "102", "111", "111" are $f, $o, $o encoded as strings
44 are commas and 91 and 93 are square brakets.
Of course if I output this somewhere I'll get string "[102,111,111]" which is obviously not that what I what.
If i try
mochijson2:encode(<<"foo">>).
[34,<<"foo">>,34]
So I again i get a list of two doublequotes and binary part within which can be translated to binary with list_to_binary/1
Here is the question - why is it so inconsistent. I understand that there is a problem distingushing erlang list that should be encoded as json array and erlang string which should be encoded as json string, but at least can it output binary when i pass it binary?
And the second question:
Looks like mochijson outputs everything nice (cause it uses special tuple to designate arrays {array, ...})
mochijson:encode(<<"foo">>).
"\"foo\""
What's the difference between mochijson2 and mochijson? Performance? Unicode handling? Anything else?
Thanks

My guess is that the decision in mochijson is that it treats a binary as a string, and it treats a list of integers as a list of integers. (Un?)fortunately strings in Erlang are in fact a list of integers.
As a result your "foo", or in other words, your [102,111,111] is translated into text representing "[102,111,111]". In the second case your <<"foo">> string becomes "foo"
Regarding the second question, mochijson seems to always return a string, whereas mochijson2 returns an iodata type. Iodata is basically a recursive list of strings, binaries and iodatas (in fact iolists). If you only intend to send the result "through the wire", it is more efficient to just nest them in a list than convert them to a flat string.

Related

ERROR: malformed array literal in PostgreSQL

I want filter on integer array in postgresql but when I am executing below query its giving me malformed array literal error.
select * from querytesting where 1111111111 = any((jsondoc->>'PhoneNumber')::integer[]);
Open image for reference-
https://i.stack.imgur.com/Py3Z2.png
any(x) wants a PostgreSQL array as x. (jsondoc->>'PhoneNumber'), however, is giving you a text representation of a JSON array. A PostgreSQL array would look like this as text:
'{1,2,3}'
but the JSON version you get from ->> would look like:
'[1,2,3]'
You can't mix the two types of array.
You could use a JSON operator instead:
jsondoc->'PhoneNumber' #> 1111111111::text::jsonb
Using -> instead of ->> gives you a JSON array rather than text. Then you can see if the number you're looking for is in that array with #>. The double cast (::text::jsonb) is needed to convert the PostgreSQL number to a JSON number for the #> operator.
As an aside, storing phone numbers as numbers might not be the best idea. You don't do arithmetic on phone numbers so they're not really numbers at all, they're really strings that contain digit characters. Normalizing the phone number format to international standards and then treating them as strings will probably serve you better in the long term.

TCL Dict to JSON

I am trying to convert a dict into JSON format and not seeing any easy method using TclLib Json Package. Say, I have defined a dict as follows :
set countryDict [dict create USA {population 300 capital DC} Canada {population 30 capital Ottawa}]
I want to convert this to json format as shown below:
{
"USA": {
"population": 300,
"captial": "DC"
},
"Canada": {
"population": 30,
"captial": "Ottawa"
}
}
("population" is number and capital is string). I am using TclLib json package (https://wiki.tcl-lang.org/page/Tcllib+JSON) . Any help would be much appreciated.
There's two problems with the “go straight there” approach that you appear to be hoping for:
Tcl's type system is extremely different to JSON's; in Tcl, every value is a (subtype of) string, but JSON expects objects, arrays, numbers and strings to wholly different things.
The capital becomes captial. For bonus fun. (Hopefully that's just a typo on your part, but we'll cope.)
I'd advise using rl_json for this; it's a much more capable package that treats JSON as a fundamental type. (It's even better at it when it comes to querying into the JSON structure.)
package require rl_json
set result {{}}; # Literal empty JSON object
dict for {countryID data} $countryDict {
rl_json::json set result $countryID [rl_json::json template {{
"population": "~N:population",
"captial": "~S:capital"
}} $data]
# Yes, that was {{ … }}, the outer ones are for Tcl & the inner ones for a JSON object
}
puts [rl_json::json pretty $result]
That produces almost exactly the output you asked for, except with different indentation. $result is the “production” version of the output that you can work with for further processing, but which has no excess whitespace at all (which is a great choice when you're dealing with documents over 100MB long).
Notes:
The initial JSON object could have been done like this:
set result "{}"
that would have worked just as well (and been the same Tcl bytecode).
json set puts an item into an object or array; that's exactly what we want here (in a dict for to go over the input data).
json template takes an optional dictionary for mapping substitution names in the template to values; that's perfect for your use case. Otherwise we'd have had to do dict with data {} to map the contents of the dictionary into variables, and that's less than perfect when the input data isn't strictly controlled.
The contents of template argument to json template is itself JSON. The ~N: prefix in a leaf string value says “replace this with a number from the substitution called…”, and ~S: says “replace this with a string from the substitution called…”. There are others.

Clojure: Escaping unicode `\U` in JSON encoding

Postamble for future readers
Elm allows literal C:\Users\myuser in strings
This is consistent with the JSON spec
My problem was unrelated to this, but several layers of escaping convoluted the problem. Future lesson: fully producing a minimal working example would have found the error!
Original question
I have a Clojure backend that talks to an Elm frontend. I hit a bump when decoding JSON values in Elm.
\U below means the literal characters backslash and U, as if read from a text file. "\\U" is the same string as input in Clojure and Elm source (\ must be escaped). Note enclosing "".
Problem: encoding \U
The literal string \U, escaped "\\U" is not accepted by the Elm string decoder.
A blog post suggests that to obtain the literal string \U, this should be encoded in source code as "\\\\U", "escaping the unicode escape".
The literal string I want to send to the client is C:\Users\myuser. I prefer to send valid JSON from the server to the client.
Clojure standard library behavior
clojure.data.json does not do anything special for strings containing the literal \U. The example below shows that \U and \m are threated equally, the backslash is escaped, and the following character ignored.
project.core> (clojure.data.json/write-str "C:\\Users\\myuser")
"\"C:\\\\Users\\\\myuser\""
Manual workaround
Temporary workaround is manually escaping the strings I need:
(defn escape-backslash-u [s]
(clojure.string/replace s "\\U" "\\\\U"))
Concrete questions
Is clojure.data.json/write-str behaving correctly? As I understand the documentation, output should be valid unicode.
Are other JSON libraries behaving similarly?
Is Elm's Json.Decode behaving correctly by rejecting the literal string \U?
Solution progress
A friendly Clojurians Slack user pointed to the JSON standard specification, specifically sections 7. Strings and 8.2. Unicode characters.
I think you may be on the wrong track here.
The string you gave as an example, "C:\\Users\\myuser" is completely unproblematic, it does not contain any Unicode escape sequences. It is a string containing the ASCII characters ‘C’, ‘:’, ‘\’, ‘U’, and so on. The backslash is the escape character in Clojure strings so it needs to be escaped itself to represent a literal backslash.
In any case the string "C:\\Users\\myuser" can be serialized with (clojure.data.json/write-str "C:\\Users\\myuser"), and, as you know, this gives "\"C:\\\\Users\\\\myuser\"". All of this seems perfectly straightforward and sound.
Printing "\"C:\\\\Users\\\\myuser\"" results in the original string "C:\\Users\\myuser" being printed. That string is accepted as valid by JSONLint, again as expected.
I understood it as Elm beeing unable to decode \"C:\\\\User... to "C:\\User... because it interprets \u as start for an escape sequence.
I tried elm here with the following code:
import Html exposing (text)
main =
text "\"c:\\\\user\\\\foo\"" // from clojure.data.json/write-str
which in turn compiles/runs to
"c:\\user\\foo"
which looks fine to me.
Are you sure there is nothing else going on (middleware, transport) ?

What is the format of this data?

I'm sorry if this is really trivial, but I've got a series of data as follows:
{"color":{"red":255,"blue":123,"green",1}}
I know it's in this format because, for some reason, it's easy to work with. What is this format called so that I might look it up?
\edit: If there is any significance to the organization of the data, of course.
That's JSON, a serialised text data storage based on a subset of JavaScript Object Notation. To learn more about JSON, vist: http://json.org
In JSON, there are the following data types:
object
array
number
string
null
boolean
Objects are represented using the following syntax, and are key-value pairings, similar to a dictionary (the key must be a string):
{ "number": 1, "string": "test" }
Like dictionaries, objects are unordered.
An array is a ordered, heterogeneous data structure, represented using the following syntax:
[0, true, false, "1", null]
Numbers are what you'd expect, however unlike JavaScript itself they cannot be Infinity or NaN (i.e. they must be decimals or integers) and contain no leading 0s. Exponents are represented using the following format (the e is not case sensitive):
10e6
where 10 is the base and 6 is the exponent - this is equivalent to 1000000 (1 million). The exponent section may have leading 0s, though there is not much point and may lower compatibility with parsers which are not 100% compliant.
Booleans are case sensitive and are both lowercase. In JSON, there are only two booleans:
true
false
To represent an intentionally left out or otherwise unknown field, use null (case sensitive too).
Strings must be delimited using double quotes (single quotes are invalid syntax), and single quotes need not be escaped.
"This string is valid, and it's alright to do this."
'No, this won't work'
'Nor will this.'
There are numerous escapes available using the backslash character - to use a literal backslash, use \\.
As JSON is a data transmission format, there is no syntax for comments available.

Can JSON start with "["?

From what I can read on json.org, all JSON strings should start with { (curly brace), and [ characters (square brackets) represent an array element in JSON.
I use the json4j library, and I got an input that starts with [, so I didn't think this was valid JSON. I looked briefly at the JSON schema, but I couldn't really find it stated that a JSON file cannot start with [, or that it can only start with {.
JSON can be either an array or an object. Specifically off of json.org:
JSON is built on two structures:
A collection of name/value pairs. In various languages, this is
realized as an object, record,
struct, dictionary, hash table,
keyed list, or associative array.
An ordered list of values. In most languages, this is realized as an
array, vector, list, or sequence.
It then goes on to describe the two structures as:
Note that the starting and ending characters are curly brackets and square brackets respectively.
Edit
And from here: http://www.ietf.org/rfc/rfc4627.txt
A JSON text is a sequence of tokens.
The set of tokens includes six
structural characters, strings,
numbers, and three literal names.
A JSON text is a serialized object or array.
Update (2014)
As of March 2014, there is a new JSON RFC (7159) that modifies the definition slightly (see pages 4/5).
The definition per RFC 4627 was: JSON-text = object / array
This has been changed in RFC 7159 to: JSON-text = ws value ws
Where ws represents whitespace and value is defined as follows:
A JSON value MUST be an object, array, number, or string, or one of
the following three literal names:
false null true
So, the answer to the question is still yes, JSON text can start with a square bracket (i.e. an array). But in addition to objects and arrays, it can now also be a number, string or the values false, null or true.
Also, this has changed from my previous RFC 4627 quote (emphasis added):
A JSON text is a sequence of tokens. The set of tokens includes six
structural characters, strings, numbers, and three literal names.
A JSON text is a serialized value. Note that certain previous
specifications of JSON constrained a JSON text to be an object or an
array. Implementations that generate only objects or arrays where a
JSON text is called for will be interoperable in the sense that all
implementations will accept these as conforming JSON texts.
If the string you are parsing begins with a left brace ([) you can use JSONArray.parse to get back a JSONArray object and then you can use get(i) where i is an index from 0 through the returned JSONArray's size()-1.
import java.io.IOException;
import com.ibm.json.java.JSONArray;
import com.ibm.json.java.JSONObject;
public class BookListTest {
public static void main(String[] args) {
String jsonBookList = "{\"book_list\":{\"book\":[{\"title\":\"title 1\"},{\"title\":\"title 2\"}]}}";
Object book_list;
try {
book_list = JSONObject.parse(jsonBookList);
System.out.println(book_list);
Object bookList = JSONObject.parse(book_list.toString()).get("book_list");
System.out.println(bookList);
Object books = JSONObject.parse(bookList.toString()).get("book");
System.out.println(books);
JSONArray bookArray = JSONArray.parse(books.toString());
for (Object book : bookArray) {
System.out.println(book);
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
Which produced output like:
{"book_list":{"book":[{"title":"title 1"},{"title":"title 2"}]}}
{"book":[{"title":"title 1"},{"title":"title 2"}]}
[{"title":"title 1"}, {"title":"title 2"}]
{"title":"title 1"}
{"title":"title 2"}
Note: if you attempted to call JSONObject.parse(books.toString()); you would get the error you encountered:
java.io.IOException: Expecting '{' on line 1, column 2 instead, obtained token: 'Token: ['
JSON.ORG WEBSITE SAYS ....
https://www.json.org/
The site clearly states the following:
JSON is built on two structures:
A collection of name/value pairs. In various languages, this is
realized as an object, record, struct, dictionary, hash table, keyed
list, or associative array.
An ordered list of values. In most languages, this is realized as
an array, vector, list, or sequence.
These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures.
In JSON, they take on these forms:
OBJECT:
An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma).
{string: value, string: value}
ARRAY:
An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).
[value, value, value ….]
VALUE:
A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
STRING:
A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. A character is represented as a single character string. A string is very much like a C or Java string.
NUMBER:
A number is very much like a C or Java number, except that the octal and hexadecimal formats are not used.
ABOUT WHITESPACE:
Whitespace can be inserted between any pair of tokens. Excepting a few encoding details, that completely describes the language.
Short answer is YES
In a .json file you can put Numbers (even just 10), Strings (even just "hello"), Booleans (true, false), Null (even just null), arrays and objects.
https://www.json.org/json-en.html
Using just Numbers, Strings, Booleans and Null are not logical because in .jon files we use more complicated structured data like arrays and object (mostly mix nested versions).
Below you can find a sample JSON data with array of object and start with "["
https://jsonplaceholder.typicode.com/posts