Ruby: read a JSON file without escaping the characters

File Content:
file.txt:
{ 'a' : 'b"c\g' }
I need to parse this JSON.
read_file = File.read('file.txt')
The read_file string has the form: "{ 'a' : 'b\"c\\g'}\n".
While parsing the JSON:
JSON::ParserError: 757: unexpected token at '{ 'a' : 'b"c\g'}
from /usr/share/ruby/json/common.rb:155:in `parse'
from /usr/share/ruby/json/common.rb:155:in `parse'
from (irb):21
from /usr/bin/irb:12:in `<main>`
The file could contain any escape sequence or wildcard, but it will always be in JSON format.
How can I parse such a JSON file into a Ruby Hash?

This is because the JSON in your text file is invalid. (There is a similar question about parsing a JSON string that uses single quotes.) For it to be valid JSON it needs:
Double quotes
Escaped special characters
Without these you won't be able to parse it, because it won't be recognized as valid JSON. Try parsing this text instead:
{ "a": "b\"c\\g" } => {"a"=>"b\"c\\g"}

Invalid string: control characters in github action - jq

Good afternoon all,
I am facing the following problem.
new_ecdsa_config.json looks something like:
{
"certificate": "value long string
multine"
}
When I run the github action for reading the value
- name: Add new ECDSA to Organization
run: |
cat new_ecdsa_config.json | jq '.'
I get the following error:
parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 26, column 53
Error: Process completed with exit code 4.
Any ideas?
When you described the problem on GitHub, you gave this as the example:
{
"certificate": "value long string
multine"
}
As the error message says, that is not valid JSON. (You can double-check this at jsonlint.com if you like.)
If you want the JSON representation of a multiline string, you'd have to escape the newline, e.g. along the lines of:
{
"certificate": "value long string\nmultiline"
}
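If the file is generated by a script, the simplest fix is to let a JSON library do the escaping. A minimal Ruby sketch (the cert variable is an assumption; the file name comes from the question):

require 'json'

# JSON.generate escapes the embedded newline, producing valid JSON
# that jq will accept.
cert = "value long string\nmultiline"
File.write('new_ecdsa_config.json', JSON.generate({ 'certificate' => cert }))
# The file now contains: {"certificate":"value long string\nmultiline"}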

How to load a JSON with a literal Unicode escape sequence "\\uNo" into Snowflake

I have the following JSON:
{
"name": "foo \\uNo bar"
}
I'm trying to load this into Snowflake using a STAGE on S3. This is in a CSV file like:
{"name": "foo \\uNo bar"}
However, when I try to load it, Snowflake breaks with an Error parsing JSON message. If I try to load it directly on Snowflake console, as SELECT PARSE_JSON('{"name": "foo \\uNo bar"}'), I get:
Error parsing JSON: hex digit is expected in \u???? escape sequence, pos 17
The problem is that Snowflake is parsing the string and expecting a valid Unicode escape in \uNo (where "No" are not hex digits). How can I disable this?
The default FILE FORMAT for parsing CSVs in Snowflake interprets the double backslash in '{"name": "foo \\uNo bar"}' as an escape sequence for the character \. This means the character sequence \uNo gets passed to PARSE_JSON, which then fails because \uNo is not a valid escape sequence in a JSON string. You can prevent this by overriding the FILE FORMAT escape-sequence settings.
Given this CSV file:
JSON
'{"name": "foo \\uNo bar"}'
And the following CREATE TABLE and COPY INTO statements:
CREATE OR REPLACE TABLE JSON_TEST (JSON TEXT);
COPY INTO JSON_TEST
FROM @my_db.public.my_s3_stage/json.csv
FILE_FORMAT = (TYPE = CSV
SKIP_HEADER = 1
FIELD_OPTIONALLY_ENCLOSED_BY = '\''
ESCAPE = NONE
ESCAPE_UNENCLOSED_FIELD = NONE);
I am able to parse the result as JSON:
SELECT PARSE_JSON(JSON) FROM JSON_TEST;
Which returns
+-----------------------------+
| JSON                        |
+-----------------------------+
| { "name": "foo \\uNo bar" } |
+-----------------------------+

Why doesn't Elasticsearch Ingest accept a grok pattern that Logstash does?

I have the following grok pattern that works in Logstash and in the Grok debugger in Kibana.
\[%{TIMESTAMP_ISO8601:req_time}\] %{IP:client_ip} (?:%{IP:forwarded_for}|\(-\)) (?:%{QS:request}|-) %{NUMBER:response_code:int} %{WORD}:%{NUMBER:request_length:int} %{WORD}:%{NUMBER:body_bytes_sent:int} %{WORD}:(?:%{QS:http_referer}|-) %{WORD}:(?:%{QS:http_user_agent}|-) (%{WORD}:(\")?(%{NUMBER:request_time:float})(\")?)?
I am trying to create a new ingest pipeline via the PUT method, but I get an error that contains:
"type": "parse_exception",
"reason": "Failed to parse content to map",
"caused_by": {
"type": "i_o_exception",
"reason": "Unrecognized character escape '[' (code 91)\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper#61326735; line: 7, column: 25]"
}
Elasticsearch requires that grok patterns in pipelines submitted via the PUT method be properly JSON-escaped, while Logstash patterns use different escaping.
That means preceding brackets with double backslashes (\\[) and double quotes with triple backslashes (\\\"). The working pattern (after running it through a JSON escaping tool) is:
\\[%{TIMESTAMP_ISO8601:req_time}\\] %{IP:client_ip} (?:%{IP:forwarded_for}|\\(-\\)) (?:%{QS:request}|-) %{NUMBER:response_code:int} %{WORD}:%{NUMBER:request_length:int} %{WORD}:%{NUMBER:body_bytes_sent:int} %{WORD}:(?:%{QS:http_referer}|-) %{WORD}:(?:%{QS:http_user_agent}|-) (%{WORD}:(\\\")?(%{NUMBER:request_time:float})(\\\")?)?
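A JSON library can do this escaping for you. A minimal Ruby sketch (using a shortened, hypothetical pattern for brevity):

require 'json'

# String#to_json doubles the backslashes and escapes embedded quotes,
# yielding a string that is safe to embed in the PUT request body.
pattern = '\[%{TIMESTAMP_ISO8601:req_time}\] (?:%{QS:request}|-)'
puts pattern.to_json
# => "\\[%{TIMESTAMP_ISO8601:req_time}\\] (?:%{QS:request}|-)"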

Parse json with newline

Given JSON like this:
{
"description": "foo \n bar"
}
If I try to create a Hash(String, String) using the from_json method, I get the following error:
Unexpected char '
' at 1:22 (JSON::ParseException)
How can I correctly parse it?
require "json"
Hash(String, String).from_json(%({"description": "foo \n bar"}))
https://play.crystal-lang.org/#/r/3ynh
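The \n inside the %(...) literal is interpreted by the compiler, so the JSON text ends up containing a raw newline, which is not allowed inside a JSON string. Ruby's %(...) literals behave the same way; a minimal Ruby sketch of the fix (escaping the backslash so the JSON text keeps the two-character \n escape):

require 'json'

# %({"description": "foo \\n bar"}) contains a literal backslash-n,
# which the parser decodes into a newline character in the value.
hash = JSON.parse(%({"description": "foo \\n bar"}))
hash # => {"description"=>"foo \n bar"}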

Error in parsing JSON: Expecting '{', '['

This is the JSON:
"{
'places': [
{
'name': 'New\x20Orleans,
\x20US\x20\x28New\x20Lakefront\x20\x2D\x20NEW\x29',
'code': 'NEW'
}
]
}"
I am getting a JSON parse error. I am checking on http://jsonlint.com/ and it shows the following error:
Parse error on line 1:
"{ 'places': [
^
Expecting '{', '['
Please explain what the problems with the JSON are and how I can correct them.
If you literally mean that the string, as a whole, is your JSON text (containing something that isn't JSON), there are three issues:
It's just a JSON fragment, not a full JSON document.
Literal line breaks within strings are not valid in JSON, use \n.
\x is an invalid escape sequence in JSON strings. If you want your contained non-JSON text to have a \x escape (e.g., when you read the value of the overall string and parse it), you have to escape that backslash: \\x.
In a full JSON document, the top level must be an object or array:
{"prop": "value"}
[1, 2, 3]
Most JSON parsers support parsing fragments, such as standalone strings. (For instance, JavaScript's JSON.parse supports this.) http://jsonlint.com is doing full document parsing, however.
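As an illustration in Ruby (assuming a json gem version 2.x or later, where top-level scalars are accepted by default):

require 'json'

# Recent versions of the json gem parse fragments, not just full
# objects and arrays:
JSON.parse('"just a string"') # => "just a string"
JSON.parse('42')              # => 42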
Here's your fragment wrapped in an object with the line breaks and \x issue handled:
{
"stuff": "{\n 'places': [\n {\n 'name': 'New\\x20Orleans,\n \\x20US\\x20\\x28New\\x20Lakefront\\x20\\x2D\\x20NEW\\x29',\n 'code': 'NEW'\n }\n \n ]\n }"
}
The text within the string is also not valid JSON, but perhaps it's not meant to be. For completeness: JSON requires that all keys and strings be in double quotes ("), not single quotes ('). It also doesn't allow literal line breaks within string literals (use \n instead), and doesn't support \x escapes. See http://json.org for details.
Here's a version as valid JSON with the \x converted to the correct JSON \u escape:
{
"places": [
{
"name": "New\u0020Orleans,\n\u0020US\u0020\u0028New\u0020Lakefront\u0020\u002D\u0020NEW\u0029",
"code": "NEW"
}
]
}
...also those escapes are all actually defining perfectly normal characters, so:
{
"places": [
{
"name": "New Orleans,\n US (New Lakefront - NEW)",
"code": "NEW"
}
]
}
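A quick check in Ruby (a sketch, assuming the cleaned-up document above is saved as places.json):

require 'json'

# The parentheses and hyphen come back as ordinary text;
# \n decodes to a real newline in the parsed value.
places = JSON.parse(File.read('places.json'))
places['places'].first['name']
# => "New Orleans,\n US (New Lakefront - NEW)"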
Read http://json.org/. Here is a valid-JSON version that keeps the \x sequences as literal text:
{
"places": [
{
"name": "New\\x20Orleans,\\x20US\\x20\\x28New\\x20Lakefront\\x20\\x2D\\x20NEW\\x29",
"code": "NEW"
}
]
}