Ruby String Automatically Escaped - json

I am pulling JSON data from a website and am trying to output it within another JSON array.
The data is being pulled properly, but when I put it into an array of my own, the data gets escaped...
Example:
"{\n \"company\": {\n \"name\": \"SomeName\", \n \"searching\": false, \n \"status\": \"LEAD\"\n}"
How can I prevent the \ and \n from inserting themselves everywhere?
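A likely cause, going only by the escaped output above, is that the fetched body is still a String when it goes into the array, so the serializer encodes it as text. A minimal sketch of the difference, assuming the standard json library; `fetched` is a hypothetical stand-in for the response body:

```ruby
require 'json'

# Hypothetical stand-in for the body pulled from the remote site.
fetched = <<~JSON
  {
    "company": {
      "name": "SomeName",
      "searching": false,
      "status": "LEAD"
    }
  }
JSON

# Embedding the raw String: it is serialized as text, so quotes and
# newlines come out as \" and \n.
double_encoded = JSON.generate([fetched])

# Parsing first yields a Hash, which nests as a real JSON object.
nested = JSON.generate([JSON.parse(fetched)])

puts double_encoded
puts nested
```

The second form prints the company object inside the array with no backslashes, because the data was converted to a Ruby structure before being serialized again.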

Related

Ruby Read file without escaping the characters

File Content:
file.txt:
{ 'a' : 'b"c\g' }
Need to parse this JSON.
read_file = File.read('file.txt')
This read_file string is of the form : "{ 'a' : 'b\"c\\g'}\n".
While parsing the JSON:
JSON::ParserError: 757: unexpected token at '{ 'a' : 'b"c\g'}
from /usr/share/ruby/json/common.rb:155:in `parse'
from /usr/share/ruby/json/common.rb:155:in `parse'
from (irb):21
from /usr/bin/irb:12:in `<main>`
The file could contain any escaping sequence or wild-cards, but it will always be in JSON format.
How can I parse such a JSON file into a Ruby Hash?
This is because the JSON in your text file is invalid. There is a similar question about parsing a JSON string that uses single quotes. For this to be valid JSON it needs:
Double quotes
Escaped special characters
Without these it won't be recognized as valid JSON, so you'll be unable to parse it. Try parsing this text instead:
{ "a": "b\"c\\g" } => {"a"=>"b\"c\\g"}
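As a rough check of the corrected text, here is a small sketch using the standard json library. The heredoc is only a stand-in for `File.read('file.txt')`, and it assumes the file has been rewritten with double quotes and escaped special characters as described above:

```ruby
require 'json'

# Stand-in for File.read('file.txt'): a non-interpolating heredoc keeps the
# backslashes exactly as they would appear in the file.
text = <<~'JSON'
  { "a": "b\"c\\g" }
JSON

parsed = JSON.parse(text)
puts parsed["a"]   # prints b"c\g

# The single-quoted form from the question is rejected by the parser.
begin
  JSON.parse("{ 'a' : 'b' }")
rescue JSON::ParserError => e
  puts "rejected: #{e.class}"
end
```

The backslashes seen when inspecting the result in irb are only part of the display form; the value itself holds one double quote and one backslash.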

NiFi CSVRecordSetWriter problems

I am trying to transform JSON to CSV using the UpdateRecord processor.
My input JSON can contain fields with | and ". In the output CSV file I want to use | as the field delimiter and " to quote all fields.
So, example for input json file:
{"field1": "value1",
"id": "11112",
"desc1": "description Text \""
}
I get the following result:
"value1"|"11112"|"description Text ""
Expected result:
"value1"|"11112"|"description Text \""
Property of CSVRecordSetWriter:
What can I do to add a \ before the quotes in the values?
I also have a problem when a field contains a \ (such as "example str\"). I expect "example str\\", but that is not what I get.

jq construct with value strings spanning multiple lines

I am trying to form a JSON construct using jq that should ideally look like below:-
{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": [
"event"
],
"traffic_including": [
"unattributed_traffic"
],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
"attribution_campaign_id",
"attribution_creative",
"attribution_date_adjusted",
"attribution_date_utc",
"attribution_matched_by",
"attribution_matched_to",
"attribution_network",
"attribution_network_id",
"attribution_seconds_since",
"attribution_site_id",
"attribution_site_id",
"attribution_tier",
"attribution_timestamp",
"attribution_timestamp_adjusted",
"attribution_tracker",
"attribution_tracker_id",
"attribution_tracker_name",
"count",
"custom_dimensions",
"device_id_adid",
"device_id_android_id",
"device_id_custom",
"device_id_idfa",
"device_id_idfv",
"device_id_kochava",
"device_os",
"device_type",
"device_version",
"dimension_count",
"dimension_data",
"dimension_sum",
"event_name",
"event_time_registered",
"geo_city",
"geo_country",
"geo_lat",
"geo_lon",
"geo_region",
"identity_link",
"install_date_adjusted",
"install_date_utc",
"install_device_version",
"install_devices_adid",
"install_devices_android_id",
"install_devices_custom",
"install_devices_email_0",
"install_devices_email_1",
"install_devices_idfa",
"install_devices_ids",
"install_devices_ip",
"install_devices_waid",
"install_matched_by",
"install_matched_on",
"install_receipt_status",
"install_san_original",
"install_status",
"request_ip",
"request_ua",
"timestamp_adjusted",
"timestamp_utc"
]
}
What I have tried unsuccessfully thus far is below:-
json_construct=$(cat <<EOF
{
"api_key": "6AEC90B5-4169-59AF-7AC9-D655F83B4825",
"app_guid": "komacca-s-rewards-app-au-ios-production-cv8tx71",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv"
"columns_order": ["attribution_attribution_action","attribution_campaign","attribution_campaign_id","attribution_creative","attribution_date_adjusted","attribution_date_utc","attribution_matched_by","attribution_matched_to","attributio
network","attribution_network_id","attribution_seconds_since","attribution_site_id","attribution_tier","attribution_timestamp","attribution_timestamp_adjusted","attribution_tracker","attribution_tracker_id","attribution_tracker_name","
unt","custom_dimensions","device_id_adid","device_id_android_id","device_id_custom","device_id_idfa","device_id_idfv","device_id_kochava","device_os","device_type","device_version","dimension_count","dimension_data","dimension_sum","ev
t_name","event_time_registered","geo_city","geo_country","geo_lat","geo_lon","geo_region","identity_link","install_date_adjusted","install_date_utc","install_device_version","install_devices_adid","install_devices_android_id","install_
vices_custom","install_devices_email_0","install_devices_email_1","install_devices_idfa","install_devices_ids","install_devices_ip","install_devices_waid","install_matched_by","install_matched_on","install_receipt_status","install_san_
iginal","install_status","request_ip","request_ua","timestamp_adjusted","timestamp_utc"]
}
EOF)
followed by:-
echo "$json_construct" | jq '.'
I get the following error:-
parse error: Expected separator between values at line 10, column 15
I am guessing that jq is unable to parse it because of the string literal that spans multiple lines.
Use jq itself:
my_formatted_json=$(jq -n '{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
...,
"timestamp_utc"
]
}')
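If the JSON is being assembled from a script anyway, the same idea (let a tool produce the JSON instead of typing it by hand) can be sketched in Ruby. This is only an illustration and not part of the jq answer: `request` is a hypothetical name, and `columns_order` is shortened to three entries from the question's list.

```ruby
require 'json'

# Hypothetical Ruby structure mirroring the request from the question.
request = {
  "api_key" => "XXXXXXXXXX-7AC9-D655F83B4825",
  "app_guid" => "XXXXXXXXXXXXXX",
  "time_start" => 1508677200,
  "time_end" => 1508763600,
  "traffic" => ["event"],
  "traffic_including" => ["unattributed_traffic"],
  "time_zone" => "Australia/NSW",
  "delivery_format" => "csv",
  # Shortened for illustration; the full list is in the question.
  "columns_order" => %w[attribution_attribution_action attribution_campaign timestamp_utc]
}

# The serializer inserts every comma and quote, so a missing separator
# or a string split across lines cannot occur.
json_construct = JSON.pretty_generate(request)
puts json_construct
```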
Your input "JSON" is not valid JSON, as indicated by the error message.
The first error is that a comma is missing after the key/value pair: "delivery_format": "csv", but there are others -- notably, JSON strings cannot be split across lines. Once you fix the key/value pair problem and the JSON strings that are split incorrectly, jq . will work with your text. (Note that once your input is corrected, the longest JSON string is quite short -- 50 characters or so -- whereas jq has no problems processing strings of length 10^8 quite speedily ...)
Generally, jq is rather permissive when it comes to JSON-like input, but if you're ever in doubt, it would make sense to use a validator such as the online validator at jsonlint.com
By the way, the jq FAQ does suggest various ways for handling input that isn't strictly JSON -- see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json
Along the lines of chepner's suggestion: since jq can read raw text data, you could use a jq filter to generate a legal JSON object from your script variables. For example:
#!/bin/bash
# whatever logic you have to obtain bash variables goes here
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX
# now use jq filter to read raw text and construct legal json object
json_construct=$(jq -MRn '[inputs]|map(split(" ")|{(.[0]):.[1]})|add' <<EOF
api_key $key
app_guid $guid
EOF
)
echo $json_construct
Sample Run (assumes executable script is in script.sh)
$ ./script.sh
{ "api_key": "XXXXXXXXXX-7AC9-D655F83B4825", "app_guid": "XXXXXXXXXXXXXX" }

How to get a formatted json string from a json object?

I'm storing the output of cat ~/path/to/file/blah | jq tojson in a variable to be used later in a curl POST with JSON content. It works well, but it removes all line breaks. I understand line breaks are not supported in JSON, but I'd like them to be replaced with \n characters so when the data is used it isn't all one line.
Is there a way to do this?
Example:
{
"test": {
"name": "test",
"description": "blah"
},
"test2": {
"name": "test2",
"description": "blah2"
}
}
becomes
"{\"test\":{\"name\":\"test\",\"description\":\"blah\"},\"test2\":{\"name\":\"test2\",\"description\":\"blah2\"}}"
but I'd like it to look like
{\n \"test\": {\n \"name\": \"test\",\n \"description\": \"blah\"\n },\n \"test2\": {\n \"name\": \"test2\",\n \"description\": \"blah2\" \n }\n}
I'm actually only converting it to a JSON string so that it can be posted as part of another JSON document. When it is posted, I'd like it to keep its original format, which can be achieved if there are \n characters.
I can do this manually by doing
cat file | sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g' | sed 's/\"/\\"/g'
but this is not ideal.
tojson (and the other JSON-outputting filters) will not format the JSON; the result always takes the usual compact form. There is a feature request for this, so look out for it in a future version.
You could take advantage of jq's regular formatted output, but you'll need to stringify it. You can simulate stringifying by feeding the formatted output back into jq as slurped raw input (-sR). That reads all of the input as a single string, and since the input was just a JSON object, the result is a string representation of that object.
If you don't mind the extra jq calls, you could do this:
$ var=$(jq '.' input.json | jq -sR '.')
$ echo "$var"
"{\n \"test\": {\n \"name\": \"test\",\n \"description\": \"blah\"\n },\n \"test2\": {\n \"name\": \"test2\",\n \"description\": \"blah2\"\n }\n}\n"
Then of course if your input is already formatted, you could leave out the first jq call.
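For comparison, the same two-step idea (format first, then encode the formatted text as one JSON string) can be sketched in Ruby with the standard json library. This is an illustration under that assumption and not a jq feature; `compact`, `pretty` and `encoded` are hypothetical names.

```ruby
require 'json'

# The example object from the question, in compact form.
compact = '{"test":{"name":"test","description":"blah"},"test2":{"name":"test2","description":"blah2"}}'

# Step 1: pretty-print, which yields multi-line text.
pretty = JSON.pretty_generate(JSON.parse(compact))

# Step 2: encode that text as a single JSON string literal, so the line
# breaks become \n and the quotes become \".
encoded = pretty.to_json

puts encoded
```

Decoding `encoded` on the receiving side gives back the formatted text with its line breaks intact.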
If your input contains only one JSON value, then jq isn't really buying you much here: all you need is to escape the few characters that are valid in JSON but that don't represent themselves in JSON strings, and you can easily do that using command-line utilities for general-purpose string processing.
For example:
perl -wpe '
s/\\/\\<backslash>/g;
s/\t/\\t/g;
s/\n/\\n/g;
s/\r/\\r/g;
s/"/\\"/g;
s/\\<backslash>/\\\\/g
' ~/path/to/file/blah

Error in parsing json Expecting '{', '['

This is the JSON:
"{
'places': [
{
'name': 'New\x20Orleans,
\x20US\x20\x28New\x20Lakefront\x20\x2D\x20NEW\x29',
'code': 'NEW'
}
]
}"
I am getting a JSON parser error. I checked it on http://jsonlint.com/ and it shows the following error:
Parse error on line 1:
"{ 'places': [
^
Expecting '{', '['
Please explain what the problems with the JSON are and how I can correct it.
If you literally mean that the string, as a whole, is your JSON text (containing something that isn't JSON), there are three issues:
It's just a JSON fragment, not a full JSON document.
Literal line breaks within strings are not valid in JSON, use \n.
\x is an invalid escape sequence in JSON strings. If you want your contained non-JSON text to have a \x escape (e.g., when you read the value of the overall string and parse it), you have to escape that backslash: \\x.
In a full JSON document, the top level must be an object or array:
{"prop": "value"}
[1, 2, 3]
Most JSON parsers support parsing fragments, such as standalone strings. (For instance, JavaScript's JSON.parse supports this.) http://jsonlint.com is doing full document parsing, however.
Here's your fragment wrapped in an object with the line breaks and \x issue handled:
{
"stuff": "{\n 'places': [\n {\n 'name': 'New\\x20Orleans,\n \\x20US\\x20\\x28New\\x20Lakefront\\x20\\x2D\\x20NEW\\x29',\n 'code': 'NEW'\n }\n \n ]\n }"
}
The text within the string is also not valid JSON, but perhaps it's not meant to be. For completeness: JSON requires that all keys and strings be in double quotes ("), not single quotes ('). It also doesn't allow literal line breaks within string literals (use \n instead), and doesn't support \x escapes. See http://json.org for details.
Here's a version as valid JSON with the \x converted to the correct JSON \u escape:
{
"places": [
{
"name": "New\u0020Orleans,\n\u0020US\u0020\u0028New\u0020Lakefront\u0020\u002D\u0020NEW\u0029",
"code": "NEW"
}
]
}
...also those escapes are all actually defining perfectly normal characters, so:
{
"places": [
{
"name": "New Orleans,\n US (New Lakefront - NEW)",
"code": "NEW"
}
]
}
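If the text comes from a source you cannot change, the conversion above could be automated. A rough Ruby sketch, under several assumptions: the line break inside the name was only a wrapping artifact (so the text is written on one line here), every \x escape is in the ASCII range, and no value contains a quote character. `raw`, `decoded` and `json_text` are hypothetical names.

```ruby
require 'json'

# The question's text on one line (assumption: the break was a wrapping artifact).
raw = <<~'TEXT'
  {'places': [{'name': 'New\x20Orleans,\x20US\x20\x28New\x20Lakefront\x20\x2D\x20NEW\x29', 'code': 'NEW'}]}
TEXT

# Turn each \xNN escape into the character it stands for.
decoded = raw.gsub(/\\x(\h\h)/) { $1.hex.chr(Encoding::UTF_8) }

# Naive quote swap: only safe while no value contains ' or ".
json_text = decoded.tr("'", '"')

places = JSON.parse(json_text)
puts places["places"][0]["name"]   # prints New Orleans, US (New Lakefront - NEW)
```

The quote swap is the fragile part; input with apostrophes inside values would need a real parser for the source format rather than a character substitution.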
read http://json.org/
{
"places": [
{
"name": "New\\x20Orleans,\\x20US\\x20\\x28New\\x20Lakefront\\x20\\x2D\\x20NEW\\x29",
"code": "NEW"
}
]
}