Postgres row_to_json produces invalid JSON with double escaped quotes

Postgres escapes quotes incorrectly when creating a JSON export. Note the double quotes in the update below:
UPDATE models SET column='"hello"' WHERE id=1;
COPY (SELECT row_to_json(models)
FROM (SELECT column FROM models WHERE id=1) models)
TO '/output.json';
The contents of output.json:
{"column":"\\"hello\\""}
You can see that the quotes are escaped improperly and it creates invalid JSON.
It should be:
{"column":"\"hello\""}
How can I fix this Postgres bug or work around it?

This is not JSON related. It's about the way the text format (the default) of the COPY command handles backslashes. From the PostgreSQL documentation - COPY:
Backslash characters (\) can be used in the COPY data to quote data characters that might otherwise be taken as row or column delimiters. In particular, the following characters must be preceded by a backslash if they appear as part of a column value: backslash itself, newline, carriage return, and the current delimiter character.
(Emphasis mine.)
You can solve it by using the CSV format and changing the quote character from the double quote to something else.
To demonstrate:
SELECT row_to_json(row('"hello"'))
| "{"f1":"\"hello\""}" |
COPY (SELECT row_to_json(row('"hello"'))) TO '/output.json';
| {"f1":"\\"hello\\""} |
COPY (SELECT row_to_json(row('"hello"'))) TO '/output.json' CSV QUOTE '$';
| {"f1":"\"hello\""} |
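To see why the text format breaks the JSON, here is a minimal Python sketch of the backslash rule quoted above. The helper name copy_text_escape is made up for illustration, and it only models the characters named in the documentation excerpt (backslash, newline, carriage return, and tab as the default delimiter), not everything COPY does.

```python
import json

def copy_text_escape(value: str) -> str:
    """Rough model of how COPY's text format writes one column value."""
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")   # the backslash itself is doubled
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")    # tab is the default delimiter
        else:
            out.append(ch)
    return "".join(out)

# Valid JSON, as row_to_json would produce it.
valid_json = json.dumps({"f1": '"hello"'}, separators=(",", ":"))
# What ends up in the file after the text format escapes it.
exported = copy_text_escape(valid_json)

print(valid_json)
print(exported)
```

The JSON is correct before COPY touches it; the extra backslashes are added by the text format afterwards, which is why the file no longer parses.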

The answer by Simo Kivistö works if you are certain that the character $, or whatever the special quote character you chose does not appear in your strings. In my case, I had to export a very large table and there was no particular character which didn't appear in the strings.
To work around this issue, I piped the output of the COPY command to sed to revert the double escaping of quotes:
psql -c "COPY (SELECT row_to_json(t) from my_table as t) to STDOUT;" |
sed 's/\\"/\"/g' > my_table.json
The sed expression I am piping to simply replaces occurrences of \\" with \".
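Note that the sed expression only touches backslashes that sit directly before a quote, so an escaped backslash inside a JSON string would stay doubled. A hedged generalization in Python, assuming every backslash in the export was doubled by COPY and that no row contains a raw newline or tab (the function name and the sample line are made up for illustration):

```python
import json

def undo_copy_backslashes(line: str) -> str:
    """Collapse each doubled backslash written by COPY's text format."""
    # str.replace scans left to right without overlapping matches,
    # so four backslashes become two, two become one.
    return line.replace("\\\\", "\\")

# One exported line: an escaped quote and an escaped backslash, both doubled.
exported = r'{"f1":"\\"hello\\"","path":"C:\\\\temp"}'
restored = undo_copy_backslashes(exported)

print(restored)
print(json.loads(restored))
```

If the export can contain raw newlines or tabs (for example with pretty-printed output), this shortcut is not enough and the CSV approach is the safer route.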

Related

Escape single backslash in MySQL JSON INSERT

I'm trying to encode a regex in a JSON data field in a MySQL database.
The regex is as follows: ^\d*[13579]$ and should look the same, if I try to read it afterwards.
AFAIK, for single backslash escaping in SQL I need double backslashes.
However, when I replace the single backslash with two like this:
^\\d*[13579]$, I get an error stating:
Invalid JSON text: "Invalid escape character in string." and my IDE also shows it as an error. When I use another set of two backslashes, the error disappears, but I also get two backslashes in the final string.
Any idea, what the problem might be?
Thanks!
The double-backslash is correct for JSON.
JSON has its own escape sequences, similar to the escape sequences in regular expressions. In JSON, \b means backspace, \n means newline, \t means tab, and so on. If you want to store a literal backslash character, use \\. Otherwise the backslash must be followed by one of the recognized escape sequences.
If you store a literal backslash character in a JSON value, it must be a double backslash. If you extract that JSON value and "unquote" it, it will be returned as a single backslash as you intended.
Demo:
mysql> create table t ( j json );
mysql> insert into t set j = '["^\\\\d*[13579]$"]';
mysql> select j from t;
+-------------------+
| j                 |
+-------------------+
| ["^\\d*[13579]$"] |
+-------------------+
mysql> select j->>'$[0]' from t;
+--------------+
| j->>'$[0]'   |
+--------------+
| ^\d*[13579]$ |
+--------------+
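The four backslashes in the INSERT come from two separate encoding layers. A small Python sketch of those layers follows; mysql_string_literal is a hypothetical helper written only for this illustration (real code should bind parameters instead of building literals by hand).

```python
import json

def mysql_string_literal(text: str) -> str:
    """Hypothetical helper: wrap text in a single-quoted MySQL string literal."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"

regex = r"^\d*[13579]$"                        # the value we actually want: one backslash
json_text = json.dumps([regex])                # layer 1: JSON doubles the backslash
sql_literal = mysql_string_literal(json_text)  # layer 2: the SQL literal doubles it again

print(regex)
print(json_text)
print(sql_literal)
```

Reading the value back undoes the layers in reverse order: the SQL parser removes one level, and unquoting the JSON string removes the other.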

MySQL- Insert single escape character into MySQL JSON field

In dealing with the headache of the different rulesets with TEXT escaping and JSON escaping, I've come across the issue where double escaping is required to convert a string to a JSON literal. For example, the original UPDATE looks like this:
UPDATE sourcing_item_data SET data_JSON='{"test": "test \ test"}' WHERE ID = 1;
The above simply removes the '\'.
The problem is I can't see how we get a single backslash into the system. Using two \'s causes the Invalid JSON error. Using three \'s does the same. Using four \'s puts in two \'s.
How does one get a single backslash into a JSON literal from a string with MySQL?
Also, has anyone written a SP or Function that scans a string that's supposed to be converted to MySQL JSON to ensure the string is "scrubbed" for issues (such as this one)?
Thanks!
Four backslashes works.
UPDATE sourcing_item_data SET data_JSON='{"test": "test \\\\ test"}' WHERE ID = 1;
You need to double the backslash to escape it in JSON, and then double each of those to escape in the SQL string.
If you print the JSON value it will show up as two backslashes, but that's because it shows the value in JSON format, which means that the backslash has to be escaped. If you extract the value and unquote it, there will just be one backslash.
select data_JSON->>"$.test" as result
from sourcing_item_data
WHERE id = 1;
shows test \ test
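The one, two, three and four backslash cases from the question can be reproduced with a simplified model in Python. This is a sketch under assumptions: parse_mysql_literal_body is a made-up function that mimics MySQL's documented string-literal escapes (ignoring the special handling of \% and \_), and Python's json module stands in for MySQL's JSON parser.

```python
import json

# Escape sequences MySQL recognizes inside a quoted string literal.
MYSQL_ESCAPES = {"0": "\0", "'": "'", '"': '"', "b": "\b", "n": "\n",
                 "r": "\r", "t": "\t", "Z": "\x1a", "\\": "\\"}

def parse_mysql_literal_body(body: str) -> str:
    """Simplified model of what the SQL parser does to a literal's contents."""
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Unrecognized escapes lose the backslash and keep the character.
            out.append(MYSQL_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

def stored_value(backslashes: int) -> str:
    """Return what would be stored for a literal typed with N backslashes."""
    body = '{"test": "test ' + "\\" * backslashes + ' test"}'
    json_text = parse_mysql_literal_body(body)
    try:
        return json.loads(json_text)["test"]
    except json.JSONDecodeError:
        return "Invalid JSON"

for n in range(1, 5):
    print(n, repr(stored_value(n)))
```

Two and three backslashes both leave a lone backslash in the JSON text, which is not a valid JSON escape; only four survive both parsers as a single backslash.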

Is there any scope that escapes all special characters in a MySQL query?

I have a set of queries with random data that I want to insert into the database. The random data may contain any special characters.
For example:
INSERT INTO tablename VALUES('!^'"twco\dq');
Is there any scope that escapes all special characters?
Please help.
No, there is no "scope" in MySQL to automatically escape all special characters.
If you have a text file containing statements that were created with potentially unsafe "random values" like this:
INSERT INTO tablename VALUES('!^'"twco\dq');
                              ^^^^^^^^^^^
You're basically screwed. MySQL can't unscramble a scrambled egg. There's no "mode" that makes MySQL work with a statement like that.
Fortunately, that particular statement will throw an error. More tragic would be some nefariously random data,
x'); DROP TABLE students; --
if that random string got incorporated into your SQL text without being escaped, the result would be:
INSERT INTO tablename VALUES('x'); DROP TABLE students; --');
The escaping of special characters has to be done before the values are incorporated into SQL text.
You'd need to take your random string value:
!^'"twco\dq
And run it through a function that performs the necessary escaping to make that value safe to include as part of the SQL statement.
MySQL provides the mysql_real_escape_string() function as part of its C library. Reference https://dev.mysql.com/doc/refman/5.5/en/mysql-real-escape-string.html. This same functionality is exposed through the MySQL Connectors for several languages.
An even better pattern than "escaping" is to use prepared statements with bind placeholders, so your statement would be a static literal, like this:
INSERT INTO tablename VALUES ( ? )
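A runnable sketch of the placeholder pattern in Python. To keep it self-contained it uses the standard library's sqlite3 module as a stand-in for a MySQL connection; that substitution is an assumption of this example, and MySQL drivers for Python commonly use %s rather than ? as the placeholder.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (val TEXT)")

# Values full of characters that would break a hand-built SQL string.
values = ["!^'\"twco\\dq", "x'); DROP TABLE students; --"]

for v in values:
    # The statement text is static; the value travels separately as a parameter.
    conn.execute("INSERT INTO tablename VALUES (?)", (v,))

stored = [row[0] for row in conn.execute("SELECT val FROM tablename ORDER BY rowid")]
print(stored == values)
```

Because the value is never spliced into the SQL text, no escaping function is needed and the stored data matches the input character for character.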
You can use the \ character to escape special characters, as below.
INSERT INTO tablename VALUES('\!\^\'\"twco\\dq');
Per MySQL documentation, below are the defined escape sequences
Table 9.1 Special Character Escape Sequences
\0 An ASCII NUL (0x00) character.
\' A single quote (“'”) character.
\" A double quote (“"”) character.
\b A backspace character.
\n A newline (linefeed) character.
\r A carriage return character.
\t A tab character.
\Z ASCII 26 (Control+Z). See note following the table.
\\ A backslash (“\”) character.
\% A “%” character. See note following the table.
\_ A “_” character. See note following the table.

How to use UPDATE in MySQL with string containing escape characters

please look here:
UPDATE cars_tbl
SET description = '{\rtf1'
WHERE (ID=1)
Description field is "blob", where my RTF document is to be stored.
When I check updated data I always find
{
tf1
\r simply disappears. I tried to find a solution on the web, but without success. My RTF files are corrupted in many places, because the escape characters used in the string are substituted. How can I suppress this substitution and update the field with the string as is?
Thanks for any advice
Lyborko
Backslash is an escape character, so to keep it you need a double backslash:
UPDATE cars_tbl
SET description = '{\\rtf1'
WHERE (ID=1)
As an aside, \r is a carriage return, and it hasn't disappeared from your data; it is responsible for tf1 appearing on the line below the {.
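A more generic approach is MySQL's QUOTE() function.
QUOTE() takes a string argument and returns it as a properly escaped data value for use in an SQL statement.
It does this by enclosing the string in single quotes and preceding each single quote, backslash, ASCII NUL and Control+Z with a backslash.
Note that QUOTE() only sees the value after the parser has processed the literal, so it helps when the string arrives as data (from a column, a variable or a bound parameter), not when it is typed directly into the SQL text. The returned value also includes the enclosing single quotes.
Example, showing the escaped form of the stored value:
SELECT QUOTE(description)
FROM cars_tbl
WHERE (ID=1)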
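The escaping rule that QUOTE() applies can be made concrete with a short Python sketch. quote_like_mysql is a made-up name; it follows the rule as described in this answer and is meant as an illustration, not as a replacement for the driver's own escaping or for bound parameters.

```python
def quote_like_mysql(value: str) -> str:
    """Mimic the QUOTE() rule: wrap in single quotes, backslash-escape
    single quote, backslash, ASCII NUL and Control+Z."""
    out = ["'"]
    for ch in value:
        if ch in ("'", "\\"):
            out.append("\\" + ch)
        elif ch == "\0":
            out.append("\\0")
        elif ch == "\x1a":
            out.append("\\Z")
        else:
            out.append(ch)
    out.append("'")
    return "".join(out)

rtf = r"{\rtf1"                 # the real data: brace, backslash, r, t, f, 1
print(quote_like_mysql(rtf))    # a literal that is safe to paste into SQL text
print(quote_like_mysql("it's"))
```

The important point is that the function receives the real data, with its single backslash intact, before any SQL parser has had a chance to interpret \r.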
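UPDATE
To escape your RTF you can also use REPLACE so that every \ becomes \\. Inside an SQL string literal each backslash must itself be doubled, so a single backslash is written '\\' and two backslashes are written '\\\\'. This only helps for values that arrive as data; a literal typed into the SQL text has already had its escapes processed by the time REPLACE runs.
Example
SELECT REPLACE(description, '\\', '\\\\')
FROM cars_tbl
WHERE (ID=1)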

Select special characters in MySQL

I need to select from fields that can contain special characters, for example:
+--------------+
| code         |
+--------------+
| **4058947"_\ |
| **4123/"_\   |
| sew'-8947"_\ |
+--------------+
I tried this:
select code from table where code REGEXP '[(|**4058947"_\|)]';
select code from table where code REGEXP '[(**4058947"_\)]';
select code from table where code REGEXP '^[(**4058947"_\)]';
but these queries return all rows, and this query returns nothing:
select code from table where code REGEXP '^[(**4058947"_\)]$';
I need it to return only the first row, or whichever one I specify.
To select only one row, you could just do this if it doesn't matter which one.
SELECT code FROM table LIMIT 1
If it does matter, drop the regex.
SELECT code FROM table WHERE code = "**4058947\"_\\"
To match those special characters (in this case, " and \), you need to "escape" them. (That's how it's called. I didn't make that up.) In most mainstream languages this is done by putting a backslash in front of it (MySQL does it this way too). The backslash is the escape character, a backslash with another character behind it is called an escape sequence. As you see, I escaped the quote and the backslash in the code value I want to match, so it should work now.
If you need to keep the regexes (which I hope is not the case, since you have the literal string you want to match against), the same thing applies: escape quotes and backslashes and you'll be fine, provided you drop the parentheses and brackets. Note that in a regex you need to escape far more characters, because some characters (for example | [ ] ( ) * +) have a special function in a regex. This is very handy, but becomes a bit of a problem when you need to match a string with that character in it. In that case you need to escape it, but with a double backslash! This is because MySQL first parses the query text as a string literal, where \\ becomes a single backslash (and a backslash in front of a character that is not a recognized escape sequence is silently dropped). Only then is the result parsed as a regex. This gets ugly very quickly, since it means matching a backslash with a MySQL regex requires 4 backslashes: two in the regex, each doubled because MySQL parses it as a string first!
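The two parsing passes can be imitated in Python to check the backslash count. This is only a sketch: after_string_parsing is a made-up, simplified model of the SQL string parser (it handles nothing but doubled backslashes), and Python's re module stands in for MySQL's regex engine, which may differ in other details.

```python
import re

def after_string_parsing(literal_body: str) -> str:
    """Simplified model: the SQL string parser turns each doubled backslash into one."""
    return literal_body.replace("\\\\", "\\")

code = '**4058947"_\\'     # the stored value; it ends in a single backslash

# Four backslashes typed in the query text, followed by the end anchor.
typed_four = r"\\\\$"
pattern = after_string_parsing(typed_four)    # the regex engine sees two backslashes and $
print(pattern)
print(bool(re.search(pattern, code)))         # matches the trailing backslash

# With only two typed, the regex engine sees an escaped dollar sign instead.
typed_two = r"\\$"
print(bool(re.search(after_string_parsing(typed_two), code)))

# The same doubling applies to other special characters such as *.
stars = after_string_parsing(r"^\\*\\*4058947")
print(bool(re.search(stars, code)))
```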