We are loading data into Snowflake using a JavaScript procedure.
The script will loop over an array of objects to load some data. These objects contain string that may have special characters.
i.e.:
"Description": "This file contain "sensitive" information."
The double quotes on sensitive word will become:
"Description": "This file contain \"sensitive\" information."
Which broke the loading script.
The same issue happened when we used HTML tags within description key:
"Description": "Please use <b>specific fonts</b> to update the file".
This is another example on the Snowflake community site.
Also this post recommended setting FIELD_OPTIONALLY_ENCLOSED_BY equal to the special characters, but I am handling large data set which might have all the special characters.
How can we escape special characters automatically without updating the script and use JavaScript to loop over the whole array to anticipate and replace each special character with something else?
EDIT
I tried using JSON_EXTRACT_PATH_TEXT:
select JSON_EXTRACT_PATH_TEXT(parse_json('{
"description": "Please use \"Custom\" fonts"
}'), 'description');
and got the following error:
Error parsing JSON: missing comma, line 2, pos 33.
I think the escape characters generated by the JS procedure are escaped when passing to SQL functions.
'{"description": "Please use \"Custom\" fonts"}'
becomes
'{"description": "Please use "Custom" fonts"}'
Therefore parsing them as JSON/fetching a field from JSON fails. To avoid error, the JavaScript procedure should generate a double backslash instead of a backslash:
'{"description": "Please use \\"Custom\\" fonts"}'
I do not think there is a way to prevent this error without modifying the JavaScript procedure.
I came across this today, Gokhan is right you need the double backslashes to properly escape the quote.
Here are a couple links that explain it a little more:
https://community.snowflake.com/s/article/Escaping-new-line-character-in-JSON-to-avoid-data-loading-errors
https://community.snowflake.com/s/article/Unable-to-Insert-Data-Containing-Back-Slash-from-Stored-Procedure
For my case I found that I could address this challenge by disabling the escaping and then manually replacing the using replace function.
For your example the replace is not necessary.
select parse_json($${"description": "Please use \"Custom\" fonts"}$$);
select parse_json($${"description": "Please use \"Custom\" fonts"}$$):description;
I'm making a dialogue system in gdscript and am struggling with escape characters, specifically '\n'.
I'm using CastleDB as, although not perfect, it has allowed me to have almost everything stored in data and will allow the person doing the writing for the game to do everything outside the engine, without me having to copy and paste stuff in.
I've hit a stumbling block with escape characters. A single text entry in CastleDB doesn't support spaces, and '\n' within the string prints to '\n', not a space, in the dialog box.
I've tried using the format string function with 'some text here {space} some more text', with the space referencing a string consisting of just \n. This still prints \n. If I feed some constant string with \n in the middle directly into the function which displays the dialog text, it adds a space so I'm not really sure what is going on here.
I don't have a computer science background (I've done some C up until pointers, at which point I decided to return later).
Is there something going on in the background with my string in gdscript? It prints out just like you would expect a string to, apart from ignoring my escape characters.
Could it be something to do with the fact that it comes in as a JSON? As far as I'm aware, even if a string is chopped up and reassembled, it should still just behave like a string...?!
Anyway, I haven't included any code because I don't know what code you'd need to see. I'm hoping it's something simple that because I'm teaching myself as I go I just wasn't aware of, but can post code if it helps.
Thanks,
James
Escape sequences are a way of getting around issues with syntax. When you type a string in most programming languages, it starts with " and ends with another ". And it needs to stay on one line. Simple, right?
What if you want to put an actual " in your string? Or a new line? We need some way of telling the compiler, "hey, we want to insert a newline here, even though we can't use an actual newline character". So we use a \ to let the compiler know that the next character is part of an escape sequence.
This causes another problem: What if we literally want to put a backslash in a string? That's where the double backslash comes from: \\ is the escape sequence for \, since \ by itself has a special meaning.
But CastleDB (apparently, I'm not familiar with it) doesn't recognize escape sequences, so when you type \n it thinks you literally want \ followed by n. When it converts this to JSON, it inserts the \\ because JSON does recognize escape sequences.
GDScript also recognizes escape sequences, so print("Hello\nworld!") prints
Hello
world!
You could try input_string.replace("\\n", "\n") to replace the \n escape sequences.
I've solved this by looking at the way CastleDB data is stored on the project's github page.
For some reason "\n" was stored as "\\n" behind the scenes. Now that I know why it was printing weirdly I can change it, even though it feels like a messy solution!
To add even more weirdness to this whole backslash business, stack overflow displays a double backslash as a single backslash so I have to write \ \ \n minus the spaces to get \\n...
I'm sure there must be a reason, but it eludes me.
I have the following two lines of code:
json_str = _cases.to_json
path += " #{USER} #{PASS} #{json_str}"
When I use the debugger, I noticed that json_str appears to be formatted as JSON:
"[["FMCE","Wiltone","Wiltone","04/10/2018","Marriage + - DOM"]]"
However, when I interpolate it into another string, the quotes are removed:
"node superuser 123456 [["FMCE","Wiltone","Wiltone","04/10/2018","Marriage + - DOM"]]"
Why does string interpolation remove the quotes from JSON string and how can I resolve this?
I did find one solution to the problem, which was manually escaping the string:
json_str = _cases.to_json.gsub('"','\"')
path += " #{USER} #{PASS} \"#{json_str}\""
So basically I escape the double quotes generated in the to_json call. Then I manually add two escaped quotes around the interpolated variable. This will produce a desired result:
node superuser 123456 "[[\"FMCE\",\"Wiltone\",\"Wiltone\",\"04/10/2018\",\"Marriage + - DOM\"]]"
Notice how the outer quotes around the collection are not escaped, but the strings inside the collection are escaped. That will enable JavaScript to parse it with JSON.parse.
It is important to note that in this part:
json_str = _cases.to_json.gsub('"','\"')
it is adding a LITERAL backslash. Not an escape sequence.
But in this part:
path += " #{USER} #{PASS} \"#{json_str}\""
The \" wrapping the interpolated variable is an escape sequence and NOT a literal backslash.
Why do you think the first and last quote marks are part of the string? They do not belong to the JSON format. Your program’s behavior looks correct to me.
(Or more precisely, your program seems to be doing exactly what you told it to. Whether your instructions are any good is a question I can’t answer without more context.)
It's hard to tell with the small sample, but it looks like you might be getting quotes from your debugger output. assuming the output of .to_json is a string (usually is), then "#{json_str}" should be exactly equal to json_str. If it isn't, that's a bug in ruby somehow (doubtful).
If you need the quotes, you need to either add them manually or escape the string using whatever escape function is appropriate for your use case. You could use .to_json as your escape function even ("#{json_str.to_json}", for example).
I have generated an HTML tag through C# code. I am able to render correctly in the text area. When I googled it, I found this. To render the HTML tags in the text area, we need to convert the '<','>' into HTML entity references. But when I am trying to replace using String.Replace, it throws an error: Too many characters in character literal
.
string psHtmlOutput="<html><body><table border='0' cellspacing='3' cellpadding='3'><tr><th> Name </th><th>DomainName</th><th>DomainType</th><th>Defualt</th></tr><tr><td>india.local</td><td>india.local</td><td>Authoritative</td><td>True</td></tr></table></body></html>";
psHtmlOutput.Replace('>','>');
psHtmlOutput.Replace('<','<');
Error: Too many characters in character literal
Please help; how can I proceed?
The String.Replace method has two overloads:
One that operates on Strings.
One that operates on Chars.
In C#, single quotation marks are used to specify Char literals. Because you have used single quotes, the second overload of the method has been used. However, your second argument is not a valid character literal because > is not a single character.
So if you actually want to replace the character with a string, just use the overload that takes strings:
psHtmlOutput.Replace(">", ">");
psHtmlOutput.Replace("<", "<");
I've got a two column CSV with a name and a number. Some people's name use commas, for example Joe Blow, CFA. This comma breaks the CSV format, since it's interpreted as a new column.
I've read up and the most common prescription seems to be replacing that character, or replacing the delimiter, with a new value (e.g. this|that|the, other).
I'd really like to keep the comma separator (I know excel supports other delimiters but other interpreters may not). I'd also like to keep the comma in the name, as Joe Blow| CFA looks pretty silly.
Is there a way to include commas in CSV columns without breaking the formatting, for example by escaping them?
To encode a field containing comma (,) or double-quote (") characters, enclose the field in double-quotes:
field1,"field, 2",field3, ...
Literal double-quote characters are typically represented by a pair of double-quotes (""). For example, a field exclusively containing one double-quote character is encoded as """".
For example:
Sheet: |Hello, World!|You "matter" to us.|
CSV: "Hello, World!","You ""matter"" to us."
More examples (sheet → csv):
regular_value → regular_value
Fresh, brown "eggs" → "Fresh, brown ""eggs"""
" → """"
"," → ""","""
,,," → ",,,"""
,"", → ","""","
""" → """"""""
See wikipedia.
I found that some applications like Numbers in Mac ignore the double quote if there is space before it.
a, "b,c" doesn't work while a,"b,c" works.
The problem with the CSV format, is there's not one spec, there are several accepted methods, with no way of distinguishing which should be used (for generate/interpret). I discussed all the methods to escape characters (newlines in that case, but same basic premise) in another post. Basically it comes down to using a CSV generation/escaping process for the intended users, and hoping the rest don't mind.
Reference spec document.
If you want to make that you said, you can use quotes. Something like this
$name = "Joe Blow, CFA.";
$arr[] = "\"".$name."\"";
so now, you can use comma in your name variable.
You need to quote that values.
Here is a more detailed spec.
In addition to the points in other answers: one thing to note if you are using quotes in Excel is the placement of your spaces. If you have a line of code like this:
print '%s, "%s", "%s", "%s"' % (value_1, value_2, value_3, value_4)
Excel will treat the initial quote as a literal quote instead of using it to escape commas. Your code will need to change to
print '%s,"%s","%s","%s"' % (value_1, value_2, value_3, value_4)
It was this subtlety that brought me here.
You can use Template literals (Template strings)
e.g -
`"${item}"`
CSV files can actually be formatted using different delimiters, comma is just the default.
You can use the sep flag to specify the delimiter you want for your CSV file.
Just add the line sep=; as the very first line in your CSV file, that is if you want your delimiter to be semi-colon. You can change it to any other character.
This isn't a perfect solution, but you can just replace all uses of commas with ‚ or a lower quote. It looks very very similar to a comma and will visually serve the same purpose. No quotes are required
in JS this would be
stringVal.replaceAll(',', '‚')
You will need to be super careful of cases where you need to directly compare that data though
Depending on your language, there may be a to_json method available. That will escape many things that break CSVs.
I faced the same problem and quoting the , did not help. Eventually, I replaced the , with +, finished the processing, saved the output into an outfile and replaced the + with ,. This may seem ugly but it worked for me.
May not be what is needed here but it's a very old question and the answer may help others. A tip I find useful with importing into Excel with a different separator is to open the file in a text editor and add a first line like:
sep=|
where | is the separator you wish Excel to use.
Alternatively you can change the default separator in Windows but a bit long-winded:
Control Panel>Clock & region>Region>Formats>Additional>Numbers>List separator [change from comma to your preferred alternative]. That means Excel will also default to exporting CSVs using the chosen separator.
You could encode your values, for example in PHP base64_encode($str) / base64_decode($str)
IMO this is simpler than doubling up quotes, etc.
https://www.php.net/manual/en/function.base64-encode.php
The encoded values will never contain a comma so every comma in your CSV will be a separator.
You can use the Text_Qualifier field in your Flat file connection manager to as ". This should wrap your data in quotes and only separate by commas which are outside the quotes.
First, if item value has double quote character ("), replace with 2 double quote character ("")
item = item.ToString().Replace("""", """""")
Finally, wrap item value:
ON LEFT: With double quote character (")
ON RIGHT: With double quote character (") and comma character (,)
csv += """" & item.ToString() & ""","
Double quotes not worked for me, it worked for me \". If you want to place a double quotes as example you can set \"\".
You can build formulas, as example:
fprintf(strout, "\"=if(C3=1,\"\"\"\",B3)\"\n");
will write in csv:
=IF(C3=1,"",B3)
A C# method for escaping delimiter characters and quotes in column text. It should be all you need to ensure your csv is not mangled.
private string EscapeDelimiter(string field)
{
if (field.Contains(yourEscapeCharacter))
{
field = field.Replace("\"", "\"\"");
field = $"\"{field}\"";
}
return field;
}