Unable to insert and save an escaped double quote in MySQL

I am trying to save JSON values in MySQL, but every time the value contains double quotes, even if the query is properly written, MySQL manages to remove the escaped \".
For example:
INSERT into JSON_VALUES SET
ID = 150,
RESULT = '[{"ID":"150","VALUE":"THIS IS A \"TEST\" THAT IS IGNORED","DATE":"2021-08-26"}]'
After executing the query the inserted value in MySQL looks like this:
[{
"ID":"150",
"VALUE":"THIS IS A "TEST" THAT IS IGNORED",
"DATE":"2021-08-26"
}]
When "TEST" was supposed to be saved as \"TEST\"
Since TEST is not properly escpaed, the JSON value has a syntax error and becomes unreadable.
How do I force MySQL to preserve escaped content, or more precisely escaped double quotes?

I had the same issue some time ago. I had to use \\\" instead of \". In your case it would be:
INSERT into JSON_VALUES SET
ID = 150,
RESULT = '[{"ID":"150","VALUE":"THIS IS A \\\"TEST\\\" THAT IS IGNORED","DATE":"2021-08-26"}]'
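For anyone wondering why the extra backslashes are needed (my explanation, not part of the original answer): MySQL's string-literal parser consumes one level of backslash escapes before the value reaches the column, so \" arrives as a plain ", while \\\" leaves a literal \" in the stored text. A quick way to check from any client:
-- What the server actually receives from each literal
SELECT '\"TEST\"';      -- returns: "TEST"   (backslash consumed by the literal parser)
SELECT '\\\"TEST\\\"';  -- returns: \"TEST\" (one backslash survives per escape)
If you control the session, enabling the NO_BACKSLASH_ESCAPES SQL mode is another option; with it, backslashes in string literals are stored verbatim.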

Related

Error parsing JSON: more than one document in the input (Redshift to Snowflake SQL)

I'm trying to convert a query from Redshift to Snowflake SQL.
The Redshift query looks like this:
SELECT
cr.creatives as creatives
, JSON_ARRAY_LENGTH(cr.creatives) as creatives_length
, JSON_EXTRACT_PATH_TEXT(JSON_EXTRACT_ARRAY_ELEMENT_TEXT (cr.creatives,0),'previewUrl') as preview_url
FROM campaign_revisions cr
The Snowflake query looks like this:
SELECT
cr.creatives as creatives
, ARRAY_SIZE(TO_ARRAY(ARRAY_CONSTRUCT(cr.creatives))) as creatives_length
, PARSE_JSON(PARSE_JSON(cr.creatives)[0]):previewUrl as preview_url
FROM campaign_revisions cr
It seems like JSON_EXTRACT_PATH_TEXT isn't converted correctly, as the Snowflake query results in error:
Error parsing JSON: more than one document in the input
cr.creatives is formatted like this:
"[{""previewUrl"":""https://someurl.com/preview1.png"",""device"":""desktop"",""splitId"":null,""splitType"":null},{""previewUrl"":""https://someurl.com/preview2.png"",""device"":""mobile"",""splitId"":null,""splitType"":null}]"
It seems to me that you are not working with valid JSON data inside Snowflake.
Please review the file format used for your COPY INTO command.
If you open the "JSON" text provided in a text editor, note that it is not parsed or formatted as JSON because of the quoting you have. Once your issue with double quotes / escaped quotes is handled, you should be able to make good progress.
(Screenshot comparison: proper JSON on the left, original data on the right.)
If you are not inclined to reload your data, see if you can create a Javascript User Defined Function to remove the quotes from your string, then you can use Snowflake to process the variant column.
The following snippet is a working proof of concept that removes the doubled double quotes for you:
// The raw text uses "" wherever a single " belongs
var textOriginal = '[{""previewUrl"":""https://someurl.com/preview1.png"",""device"":""desktop"",""splitId"":null,""splitType"":null},{""previewUrl"":""https://someurl.com/preview2.png"",""device"":""mobile"",""splitId"":null,""splitType"":null}]';
// Collapse the doubled quotes, then parse the result as JSON
function parseText(input){
    var a = input.replaceAll('""', '"');
    a = JSON.parse(a);
    return a;
}
var x = parseText(textOriginal);
console.log(x);
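If you do go the UDF route, a minimal sketch (the function name fix_doubled_quotes is made up, and this assumes the column really does contain the doubled quotes shown above) could look like this:
-- Hypothetical JavaScript UDF: collapse the doubled quotes, then parse as JSON
create or replace function fix_doubled_quotes(input string)
returns variant
language javascript
as
$$
  // Snowflake upper-cases JavaScript UDF argument names, hence INPUT
  return JSON.parse(INPUT.replace(/""/g, '"'));
$$;
-- Example usage against the column from the question
select fix_doubled_quotes(cr.creatives)[0]:previewUrl::string as preview_url
from campaign_revisions cr;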
For anyone else seeing this double double quote issue in JSON fields coming from CSV files in a Snowflake external stage (slightly different issue than the original question posted):
The issue is likely that you need to use the FIELD_OPTIONALLY_ENCLOSED_BY setting. Specifically, FIELD_OPTIONALLY_ENCLOSED_BY = '"' when setting up your fileformat.
(docs)
Example of creating such a file format:
create or replace file format mydb.myschema.my_tsv_file_format
type = CSV
field_delimiter = '\t'
FIELD_OPTIONALLY_ENCLOSED_BY = '"';
And example of querying from a stage using this file format:
select
  $1 field_one,
  $2 field_two
  -- ...and so on
from '@my_s3_stage/path/to/file/my_tab_separated_file.csv' (file_format => 'my_tsv_file_format')
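If the goal is to load the data rather than query it in place, the same file format can also be referenced from COPY INTO (the table name below is a placeholder):
copy into mydb.myschema.my_table
from @my_s3_stage/path/to/file/
file_format = (format_name = 'mydb.myschema.my_tsv_file_format');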

mySql JSON string field returns encoded

It's my first week dealing with a MySQL database and JSON field types, and I cannot figure out why values are automatically encoded and then returned in encoded format.
Given the following SQL:
-- create a multiline string with a tab example
SET @str = "Line One
Line 2 Tabbed out
Line 3";
-- encode it
SET @j = JSON_OBJECT("str", @str);
-- extract the value by name
SET @strOut = JSON_EXTRACT(@j, "$.str");
-- show the object and attribute value.
SELECT @j, @strOut;
You end up with what appears to be a fully formed JSON object with a single encoded attribute:
@j = {"str": "Line One\n\tLine 2\tTabbed out\n\tLine 3"}
but using JSON_EXTRACT to get the attribute value I get the encoded version, including the outer quotes:
@strOut = "Line One\n\tLine 2\tTabbed out\n\tLine 3"
I would expect to get my original string with the \n and \t all unescaped to their original values and no outer quotes, like this:
Line One
Line 2 Tabbed out
Line 3
I can't seem to find any JSON_DECODE or JSON_UNESCAPE or similar functions.
I did find a JSON_ESCAPE() function but that appears to be used to manually build a JSON object structure in a string.
What am I missing to extract the values to the original format?
I like to use the handy ->> operator for this.
It was introduced in MySQL 5.7.13, and basically combines JSON_EXTRACT() and JSON_UNQUOTE():
SET @strOut = @j ->> '$.str';
You are looking for the JSON_UNQUOTE function:
SET @strOut = JSON_UNQUOTE( JSON_EXTRACT(@j, "$.str") );
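Both forms return the same unquoted text; continuing with the variables from the question (the values in the comments are what I'd expect to see, not copied output):
SELECT
    JSON_EXTRACT(@j, '$.str')               AS still_json,  -- "Line One\n\tLine 2\tTabbed out\n\tLine 3"
    JSON_UNQUOTE(JSON_EXTRACT(@j, '$.str')) AS plain_text,  -- the original multi-line string
    @j ->> '$.str'                          AS also_plain;  -- shorthand for JSON_UNQUOTE(JSON_EXTRACT(...))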
The result of JSON_EXTRACT() is intentionally a JSON document, not a string.
A JSON document may be:
An object enclosed in { }
An array enclosed in [ ]
A scalar string value enclosed in " "
A scalar number or boolean value
A null — but this is not an SQL NULL, it's a JSON null. This leads to confusing cases because you can extract a JSON field whose JSON value is null, and yet in an SQL expression, this fails IS NULL tests, and it also fails to be equal to an SQL string 'null'. Because it's a JSON type, not a scalar type.
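A small illustration of that last point (a sketch with a made-up document):
SET @doc = '{"a": null}';
SELECT
    JSON_EXTRACT(@doc, '$.a')            AS extracted,    -- a JSON null
    JSON_EXTRACT(@doc, '$.a') IS NULL    AS is_sql_null,  -- 0: the SQL IS NULL test fails
    JSON_TYPE(JSON_EXTRACT(@doc, '$.a')) AS json_type;    -- 'NULL' (the JSON type name)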

Unable to Extract simple Csv file using U-SQL

I have this CSV file.
Almost all the records are getting processed fine; however, there are two cases in which I am experiencing an issue.
Case 1:
A record containing quotes within quotes:
"some data "some data" some data"
Case 2:
A record containing comma within quotes:
"some data, some data some data"
I have looked into this issue and found a way around it using the quoting parameter of the extractor, but I have observed that setting (quoting: false) solves case 1 and fails for case 2, while setting (quoting: true) solves case 2 but fails for case 1.
Constraints: there is no room for changing the data file; the future data will be tailored accordingly, but for this existing data I have to resolve it.
Try this: import the records as one row each and fix the row text by doubling the quotes (do the same for the commas):
DECLARE @input string = @"/Samples/Data/Sample1.csv";
DECLARE @output string = @"/Output/Sample1.txt";
// Import records as one row
@data =
    EXTRACT rowastext string
    FROM @input
    USING Extractors.Text('\n', quoting: false);
// Fix the row text using double quotes
@query =
    SELECT Regex.Replace(rowastext, "([^,])\"([^,])", "$1\"\"$2") AS rowascsv
    FROM @data;
OUTPUT @query
TO @output
USING Outputters.Csv(quoting : false);
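A possible follow-up step (my addition, with made-up column names): in a second script, re-extract the repaired file with a real schema and quoting enabled, so the embedded commas from case 2 are parsed correctly:
// Hypothetical second pass (separate script) over the repaired file
@fixed =
    EXTRACT col1 string,
            col2 string,
            col3 string
    FROM "/Output/Sample1.txt"
    USING Extractors.Csv(quoting: true);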

SQL - Incorrect string value: '\xEF\xBF\xBD' [duplicate]

I have a table that needs to handle various characters, including Ø, ®, etc.
I have set my table to utf-8 as the default collation and all columns use the table default; however, when I try to insert these characters I get the error: Incorrect string value: '\xEF\xBF\xBD' for column 'buyerName' at row 1
My connection string is defined as
string mySqlConn = "server="+server+";user="+username+";database="+database+";port="+port+";password="+password+";charset=utf8;";
I am at a loss as to why I am still seeing errors. Have I missed anything with either the .net connector, or with my MySQL setup?
--Edit--
My (new) C# insert statement looks like:
MySqlCommand insert = new MySqlCommand( "INSERT INTO fulfilled_Shipments_Data " +
    "(amazonOrderId,merchantOrderId,shipmentId,shipmentItemId,"+
    "amazonOrderItemId,merchantOrderItemId,purchaseDate,"+ ...
    "VALUES (@amazonOrderId,@merchantOrderId,@shipmentId,@shipmentItemId,"+
    "@amazonOrderItemId,@merchantOrderItemId,@purchaseDate,"+
    "paymentsDate,shipmentDate,reportingDate,buyerEmail,buyerName,"+ ...
insert.Parameters.AddWithValue("@amazonorderId",lines[0]);
insert.Parameters.AddWithValue("@merchantOrderId",lines[1]);
insert.Parameters.AddWithValue("@shipmentId",lines[2]);
insert.Parameters.AddWithValue("@shipmentItemId",lines[3]);
insert.Parameters.AddWithValue("@amazonOrderItemId",lines[4]);
insert.Parameters.AddWithValue("@merchantOrderItemId",lines[5]);
insert.Parameters.AddWithValue("@purchaseDate",lines[6]);
insert.Parameters.AddWithValue("@paymentsDate",lines[7]);
insert.ExecuteNonQuery();
Assuming that this is the correct way to use parametrized statements, it is still giving an error
"Incorrect string value: '\xEF\xBF\xBD' for column 'buyerName' at row 1"
Any other ideas?
\xEF\xBF\xBD is the UTF-8 encoding of the Unicode character U+FFFD. This is a special character, also known as the "replacement character". A quote from the Wikipedia page about the special Unicode characters:
The replacement character � (often a black diamond with a white question mark) is a symbol found in the Unicode standard at codepoint U+FFFD in the Specials table. It is used to indicate problems when a system is not able to decode a stream of data to a correct symbol. It is most commonly seen when a font does not contain a character, but is also seen when the data is invalid and does not match any character.
So it looks like your data source contains corrupted data. It is also possible that you try to read the data using the wrong encoding. Where do the lines come from?
If you can't fix the data, and your input indeed contains invalid characters, you could just remove the replacement characters:
lines[n] = lines[n].Replace("\xFFFD", "");
Mattmanser is right: never write an SQL query by concatenating the parameters directly into the query. An example of a parametrized query is:
string lastname = "Doe";
double height = 6.1;
DateTime birthDate = new DateTime(1978, 4, 18);
var connection = new MySqlConnection(connStr);
try
{
    connection.Open();
    var command = new MySqlCommand(
        "SELECT * FROM tblPerson WHERE LastName = @Name AND Height > @Height AND BirthDate < @BirthDate", connection);
    command.Parameters.AddWithValue("@Name", lastname);
    command.Parameters.AddWithValue("@Height", height);
    command.Parameters.AddWithValue("@BirthDate", birthDate);
    MySqlDataReader reader = command.ExecuteReader();
    ...
}
finally
{
    connection.Close();
}
To those who have a similar problem using PHP, try the function utf8_encode($string). It just works!
I had the same problem when my website encoding was UTF-8 and I tried to send a CP-1250 string through a form (for example, strings taken from directory listings).
I think you must send strings encoded the same way as the website.

Incrementing numerical value and changing string in SQL

I have a database that has stored values in a complicated, serialized array where one component is a string and another is the length of the characters of the string, in this format:
s:8:"test.com"
Where "s" holds the character length of the string in the quotations.
I would like to change the string from "test.com" to "testt.com", and I'm using the following statement in SQL:
UPDATE table SET row=(REPLACE (row, 'test.com','testt.com'))
However, this breaks the script in question, because it doesn't update the character length in the "s" preceding the string where "test.com" is stored.
I was wondering if there is a query I can use that would replace the string and also increment the length value preceding the spot where the replacement occurs, something like this:
UPDATE table SET row=(REPLACE (row, 's:' number 'test.com','s:' number+1 'testt.com'))
Does anyone know if this kind of query is even possible?
UPDATE table set row = concat('s:',length('testt.com'),':"testt.com"');
If you need to change exact string, then use exact query -
UPDATE table SET row = 's:9:"testt.com"' WHERE row = 's:8:"test.com"';
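If the serialized value is embedded in a longer string rather than filling the whole column, a variant worth trying (my sketch, reusing the placeholder table and column names from the question) is to replace the entire length-prefixed token so the length stays consistent:
UPDATE table
SET row = REPLACE(row, 's:8:"test.com"', CONCAT('s:', LENGTH('testt.com'), ':"testt.com"'));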
The string is a "serialized string".
If there are multiple strings to be replaced, it might be easier to create a script to handle this.
In PHP, it goes something like this:
$searchfor = serialize('test.com');
$replaceby = serialize('testt.com');
// strip last semicolon from serialized string
$searchfor = trim($searchfor,';');
$replaceby = trim($replaceby,';');
$query = "UPDATE table SET field = '$replaceby' WHERE field = '$searchfor';";
This way, you can create an exact query string with what you need.
Do fill in the proper code for db connection if necessary.