Error in parsing json: Expecting '{', '['

This is the JSON:
"{
'places': [
{
'name': 'New\x20Orleans,
\x20US\x20\x28New\x20Lakefront\x20\x2D\x20NEW\x29',
'code': 'NEW'
}
]
}"
I am getting a JSON parser error. I checked the text on http://jsonlint.com/ and it shows the following error:
Parse error on line 1:
"{ 'places': [
^
Expecting '{', '['
Please explain what the problems with the JSON are and how I can correct them.

If you literally mean that the string, as a whole, is your JSON text (containing something that isn't JSON), there are three issues:
It's just a JSON fragment, not a full JSON document.
Literal line breaks within strings are not valid in JSON; use \n instead.
\x is an invalid escape sequence in JSON strings. If you want your contained non-JSON text to have a \x escape (e.g., when you read the value of the overall string and parse it), you have to escape that backslash: \\x.
In a full JSON document, the top level must be an object or array:
{"prop": "value"}
[1, 2, 3]
Most JSON parsers support parsing fragments, such as standalone strings. (For instance, JavaScript's JSON.parse supports this.) http://jsonlint.com is doing full document parsing, however.
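To illustrate fragment parsing outside JavaScript: Python's standard json module also accepts standalone values. This is only a minimal sketch, and the variable names are made up for illustration.

```python
import json

# A bare string, number, or literal is accepted by json.loads,
# even though it is not an object or array.
fragment_string = json.loads('"just a string"')
fragment_number = json.loads('42')
fragment_null = json.loads('null')

print(fragment_string, fragment_number, fragment_null)
```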
Here's your fragment wrapped in an object with the line breaks and \x issue handled:
{
"stuff": "{\n 'places': [\n {\n 'name': 'New\\x20Orleans,\n \\x20US\\x20\\x28New\\x20Lakefront\\x20\\x2D\\x20NEW\\x29',\n 'code': 'NEW'\n }\n \n ]\n }"
}
The text within the string is also not valid JSON, but perhaps it's not meant to be. For completeness: JSON requires that all keys and strings be in double quotes ("), not single quotes ('). It also doesn't allow literal line breaks within string literals (use \n instead), and doesn't support \x escapes. See http://json.org for details.
Here's a version as valid JSON with the \x converted to the correct JSON \u escape:
{
"places": [
{
"name": "New\u0020Orleans,\n\u0020US\u0020\u0028New\u0020Lakefront\u0020\u002D\u0020NEW\u0029",
"code": "NEW"
}
]
}
...also those escapes are all actually defining perfectly normal characters, so:
{
"places": [
{
"name": "New Orleans,\n US (New Lakefront - NEW)",
"code": "NEW"
}
]
}
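The rules above can be checked with any strict JSON parser. The following is a minimal sketch using Python's standard json module; the helper try_parse and the shortened sample strings are assumptions made for illustration, not part of the original data.

```python
import json

# Shortened form of the original text: single quotes and \x escapes.
# A raw string keeps the backslashes exactly as written.
broken = r"{'places': [{'name': 'New\x20Orleans', 'code': 'NEW'}]}"

# The corrected text: double quotes, \n for the line break, plain characters.
fixed = '{"places": [{"name": "New Orleans,\\n US (New Lakefront - NEW)", "code": "NEW"}]}'


def try_parse(text):
    """Return the parsed value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


broken_result = try_parse(broken)
fixed_result = try_parse(fixed)
# \x is rejected even when the string uses double quotes.
x_escape_result = try_parse(r'{"name": "New\x20Orleans"}')

print(broken_result, x_escape_result)
print(fixed_result["places"][0]["name"])
```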

Read http://json.org/. Here is a valid version that keeps the \x sequences as literal text:
{
"places": [
{
"name": "New\\x20Orleans,\\x20US\\x20\\x28New\\x20Lakefront\\x20\\x2D\\x20NEW\\x29",
"code": "NEW"
}
]
}

Related

Ruby Read file without escaping the characters

File Content:
file.txt:
{ 'a' : 'b"c\g' }
Need to parse this JSON.
read_file = File.read('file.txt')
This read_file string is of the form : "{ 'a' : 'b\"c\\g'}\n".
While parsing the JSON:
JSON::ParserError: 757: unexpected token at '{ 'a' : 'b"c\g'}
from /usr/share/ruby/json/common.rb:155:in `parse'
from /usr/share/ruby/json/common.rb:155:in `parse'
from (irb):21
from /usr/bin/irb:12:in `<main>`
The file could contain any escaping sequence or wild-cards, but it will always be in JSON format.
How to parse such JSON file to ruby Hash?
This is because the JSON in your text file is invalid. Here is a similar question about parsing a JSON string that uses single quotes. For this to be valid JSON it needs:
Double quotes
Escaped special characters
Without this you'll be unable to parse it as it won't be recognized as valid json. Try parsing this text instead:
{ "a": "b\"c\\g" } => {"a"=>"b\"c\g"}

Pull value in jq with escaped vars

I have a JSON that I am trying to process.
I am using jq and can't for the life of me get the required output.
I have a simple example below:
{
"message" :"{ \"foo\": \"42\", \"bar\": \"less interesting data\"}"
}
My Build Up
jq '."message"'
{
"message" :{"foo": "42", "bar": "less interesting data"}
}
gives
{
"foo": "42",
"bar": "less interesting data"
}
."message"."bar"
gives
"less interesting data"
So
{
"message" :"{"foo": "42", "bar": "less interesting data"}"
}
FAILS as JSON invalid
{
"message" :"{\"foo\": \"42\", \"bar\": \"less interesting data\"}"
}
FAILS 'jq: error (at :3): Cannot index string with string "bar"
exit status 5'
I have tried a whole bunch of different jq queries (I won't waste your time listing them).
So I would like some advice on how I'd get "bar" from the JSON.
It's not a duplicate of convert string to JSON, as that title already leads you to the idea of conversion. Without this question, you'd never know the answer is to use fromjson.
Use the fromjson construct to restore the strings as JSON texts. So, given the content below
{
"message": "{ \"foo\": \"42\", \"bar\": \"less interesting data\" }"
}
all you need to do to extract bar is
jq '."message"|fromjson|.bar' file
"less interesting data"
To print the output without the quotes, use the -r/--raw-output flag, which emits text in raw format. As noted in the comments, fromjson.bar should also work as expected.
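The same two-step idea (parse the outer document, then parse the string it contains) applies in other JSON libraries too. A minimal sketch with Python's standard json module, assuming the same sample document; the variable names are illustrative only.

```python
import json

# The outer document; a raw string keeps the \" sequences as written.
outer_text = r'{"message": "{ \"foo\": \"42\", \"bar\": \"less interesting data\" }"}'

outer = json.loads(outer_text)        # first parse: "message" is still a string
inner = json.loads(outer["message"])  # second parse: the step fromjson performs in jq
bar = inner["bar"]

print(bar)
```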

Why doesn't Elasticsearch Ingest accept a grok pattern that Logstash does?

I have the following grok pattern that works in Logstash and in the Grok debugger in Kibana.
\[%{TIMESTAMP_ISO8601:req_time}\] %{IP:client_ip} (?:%{IP:forwarded_for}|\(-\)) (?:%{QS:request}|-) %{NUMBER:response_code:int} %{WORD}:%{NUMBER:request_length:int} %{WORD}:%{NUMBER:body_bytes_sent:int} %{WORD}:(?:%{QS:http_referer}|-) %{WORD}:(?:%{QS:http_user_agent}|-) (%{WORD}:(\")?(%{NUMBER:request_time:float})(\")?)?"
I am trying to create a new ingest pipeline via the PUT method, but I get an error that contains:
"type": "parse_exception",
"reason": "Failed to parse content to map",
"caused_by": {
"type": "i_o_exception",
"reason": "Unrecognized character escape '[' (code 91)\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper#61326735; line: 7, column: 25]"
}
Elasticsearch requires that grok patterns used in pipelines submitted using the PUT method are properly escaped JSON, while Logstash patterns use different escaping.
That includes preceding brackets with double backslashes (\\[) and double quotes with triple backslashes (\\\"). The working pattern (after running through a JSON escaping tool) is:
\\[%{TIMESTAMP_ISO8601:req_time}\\] %{IP:client_ip} (?:%{IP:forwarded_for}|\\(-\\)) (?:%{QS:request}|-) %{NUMBER:response_code:int} %{WORD}:%{NUMBER:request_length:int} %{WORD}:%{NUMBER:body_bytes_sent:int} %{WORD}:(?:%{QS:http_referer}|-) %{WORD}:(?:%{QS:http_user_agent}|-) (%{WORD}:(\\\")?(%{NUMBER:request_time:float})(\\\")?)?
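Instead of escaping by hand, a JSON serializer can do it. The sketch below uses Python's standard json module on a shortened version of the pattern. The pipeline body is hypothetical: the description text and the field name "message" are assumptions for illustration.

```python
import json

# Logstash-style pattern, shortened for illustration (raw string keeps backslashes).
grok_pattern = r'\[%{TIMESTAMP_ISO8601:req_time}\] %{IP:client_ip} (?:%{QS:request}|-) (\")?'

# json.dumps doubles each backslash and escapes each double quote,
# so \[ becomes \\[ and \" becomes \\\".
escaped = json.dumps(grok_pattern)
print(escaped)

# Hypothetical pipeline body; serializing the whole structure escapes the pattern too.
pipeline = {
    "description": "example pipeline",
    "processors": [{"grok": {"field": "message", "patterns": [grok_pattern]}}],
}
pipeline_body = json.dumps(pipeline, indent=2)
print(pipeline_body)
```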

jq construct with value strings spanning multiple lines

I am trying to form a JSON construct using jq that should ideally look like below:-
{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": [
"event"
],
"traffic_including": [
"unattributed_traffic"
],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
"attribution_campaign_id",
"attribution_creative",
"attribution_date_adjusted",
"attribution_date_utc",
"attribution_matched_by",
"attribution_matched_to",
"attribution_network",
"attribution_network_id",
"attribution_seconds_since",
"attribution_site_id",
"attribution_site_id",
"attribution_tier",
"attribution_timestamp",
"attribution_timestamp_adjusted",
"attribution_tracker",
"attribution_tracker_id",
"attribution_tracker_name",
"count",
"custom_dimensions",
"device_id_adid",
"device_id_android_id",
"device_id_custom",
"device_id_idfa",
"device_id_idfv",
"device_id_kochava",
"device_os",
"device_type",
"device_version",
"dimension_count",
"dimension_data",
"dimension_sum",
"event_name",
"event_time_registered",
"geo_city",
"geo_country",
"geo_lat",
"geo_lon",
"geo_region",
"identity_link",
"install_date_adjusted",
"install_date_utc",
"install_device_version",
"install_devices_adid",
"install_devices_android_id",
"install_devices_custom",
"install_devices_email_0",
"install_devices_email_1",
"install_devices_idfa",
"install_devices_ids",
"install_devices_ip",
"install_devices_waid",
"install_matched_by",
"install_matched_on",
"install_receipt_status",
"install_san_original",
"install_status",
"request_ip",
"request_ua",
"timestamp_adjusted",
"timestamp_utc"
]
}
What I have tried unsuccessfully thus far is below:-
json_construct=$(cat <<EOF
{
"api_key": "6AEC90B5-4169-59AF-7AC9-D655F83B4825",
"app_guid": "komacca-s-rewards-app-au-ios-production-cv8tx71",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv"
"columns_order": ["attribution_attribution_action","attribution_campaign","attribution_campaign_id","attribution_creative","attribution_date_adjusted","attribution_date_utc","attribution_matched_by","attribution_matched_to","attributio
network","attribution_network_id","attribution_seconds_since","attribution_site_id","attribution_tier","attribution_timestamp","attribution_timestamp_adjusted","attribution_tracker","attribution_tracker_id","attribution_tracker_name","
unt","custom_dimensions","device_id_adid","device_id_android_id","device_id_custom","device_id_idfa","device_id_idfv","device_id_kochava","device_os","device_type","device_version","dimension_count","dimension_data","dimension_sum","ev
t_name","event_time_registered","geo_city","geo_country","geo_lat","geo_lon","geo_region","identity_link","install_date_adjusted","install_date_utc","install_device_version","install_devices_adid","install_devices_android_id","install_
vices_custom","install_devices_email_0","install_devices_email_1","install_devices_idfa","install_devices_ids","install_devices_ip","install_devices_waid","install_matched_by","install_matched_on","install_receipt_status","install_san_
iginal","install_status","request_ip","request_ua","timestamp_adjusted","timestamp_utc"]
}
EOF)
followed by:-
echo "$json_construct" | jq '.'
I get the following error:-
parse error: Expected separator between values at line 10, column 15
I am guessing jq is unable to parse it because the string literal spans multiple lines.
Use jq itself:
my_formatted_json=$(jq -n '{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
...,
"timestamp_utc"
]
}')
Your input "JSON" is not valid JSON, as indicated by the error message.
The first error is that a comma is missing after the key/value pair: "delivery_format": "csv", but there are others -- notably, JSON strings cannot be split across lines. Once you fix the key/value pair problem and the JSON strings that are split incorrectly, jq . will work with your text. (Note that once your input is corrected, the longest JSON string is quite short -- 50 characters or so -- whereas jq has no problems processing strings of length 10^8 quite speedily ...)
Generally, jq is rather permissive when it comes to JSON-like input, but if you're ever in doubt, it would make sense to use a validator such as the online validator at jsonlint.com.
By the way, the jq FAQ does suggest various ways for handling input that isn't strictly JSON -- see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json
Along the lines of chepner's suggestion: since jq can read raw text data, you could just use a jq filter to generate a legal JSON object from your script variables. For example:
#!/bin/bash
# whatever logic you have to obtain bash variables goes here
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX
# now use jq filter to read raw text and construct legal json object
json_construct=$(jq -MRn '[inputs]|map(split(" ")|{(.[0]):.[1]})|add' <<EOF
api_key $key
app_guid $guid
EOF
)
echo $json_construct
Sample Run (assumes executable script is in script.sh)
$ ./script.sh
{ "api_key": "XXXXXXXXXX-7AC9-D655F83B4825", "app_guid": "XXXXXXXXXXXXXX" }
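The same approach, letting a serializer build the document from variables rather than typing the JSON text by hand, can be sketched in Python with the standard json module. The values are the placeholders from above, and the column list is shortened for illustration.

```python
import json

# Placeholder values standing in for whatever the script computes.
key = "XXXXXXXXXX-7AC9-D655F83B4825"
guid = "XXXXXXXXXXXXXX"

# A long list can span many source lines; the serializer decides
# how the output is laid out, so no string is ever split mid-value.
columns = [
    "attribution_attribution_action",
    "attribution_campaign",
    "timestamp_utc",
]

json_construct = json.dumps(
    {"api_key": key, "app_guid": guid, "columns_order": columns},
    indent=2,
)
print(json_construct)
```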

How to get a formatted json string from a json object?

I'm storing the output of cat ~/path/to/file/blah | jq tojson in a variable to be used later in a curl POST with JSON content. It works well, but it removes all line breaks. I understand line breaks are not supported in JSON, but I'd like them to be replaced with \n characters so when the data is used it isn't all one line.
Is there a way to do this?
Example:
{
"test": {
"name": "test",
"description": "blah"
},
"test2": {
"name": "test2",
"description": "blah2"
}
}
becomes
"{\"test\":{\"name\":\"test\",\"description\":\"blah\"},\"test2\":{\"name\":\"test2\",\"description\":\"blah2\"}}"
but I'd like it to look like
{\n \"test\": {\n \"name\": \"test\",\n \"description\": \"blah\"\n },\n \"test2\": {\n \"name\": \"test2\",\n \"description\": \"blah2\" \n }\n}
I'm actually only converting it to a JSON string so it can be posted as part of another JSON document. When it is posted, I'd like it to have the format it had originally, which can be achieved if there are \n characters.
I can do this manually by doing
cat file | sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g' | sed 's/\"/\\"/g'
but this is not ideal.
tojson (or other json outputting filters) will not format the json. It will take on the usual compact form. There is a feature request out there for this so look out for that in a future version.
You could take advantage of jq's regular formatted output, but you'll want to stringify it. You can simulate stringifying by feeding the formatted output back into jq as raw, slurped input, which reads all of the input as a single string. Since that input is just the formatted JSON object, the result is a string representation of the object with its line breaks preserved as \n.
If you don't mind the extra jq calls, you could do this:
$ var=$(jq '.' input.json | jq -sR '.')
$ echo "$var"
"{\n \"test\": {\n \"name\": \"test\",\n \"description\": \"blah\"\n },\n \"test2\": {\n \"name\": \"test2\",\n \"description\": \"blah2\"\n }\n}\n"
Then of course if your input is already formatted, you could leave out the first jq call.
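For comparison, the same two steps (format first, then encode the formatted text as a JSON string) can be sketched with Python's standard json module. This is an illustration under the assumption that the input is the sample object above; unlike the jq pipeline, it does not add a trailing newline.

```python
import json

compact = '{"test":{"name":"test","description":"blah"},"test2":{"name":"test2","description":"blah2"}}'

# Step 1: pretty-print, similar to `jq '.'`.
formatted = json.dumps(json.loads(compact), indent=2)

# Step 2: encode the formatted text as a JSON string, similar to `jq -sR '.'`.
# Line breaks become \n and double quotes become \".
stringified = json.dumps(formatted)
print(stringified)

# The formatted text can also be embedded directly as a value in another document;
# the key "body" is a made-up name.
payload = json.dumps({"body": formatted})
```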
If your input contains only one JSON value, then jq isn't really buying you much here: all you need is to escape the few characters that are valid in JSON but that don't represent themselves in JSON strings, and you can easily do that using command-line utilities for general-purpose string processing.
For example:
perl -wpe '
s/\\/\\<backslash>/g;
s/\t/\\t/g;
s/\n/\\n/g;
s/\r/\\r/g;
s/"/\\"/g;
s/\\<backslash>/\\\\/g
' ~/path/to/file/blah