Should escaped strings be "unescaped" (MySQL, Node.js)? - mysql

I am working with the node driver for MySQL by felixge. Following the documentation
https://github.com/felixge/node-mysql#escaping-query-values
strings should be escaped before passed into the MySQL to avoid injection attacs.
My question is: should data be "unescaped" in some way after loaded from MySQL?
Currently, I have a problem with data integrity: I start with a string containing newlines. (printing with console.log(string) shows newlines in the console). After escaping the string, it is saved into a MySQL database. However, after the string is loaded back into memory, a console.log(string) shows escape codes \n instead of newlines.

before passed into the MySQL to avoid injection attacks.
This statement is wrong.
First, strings should be escaped because of syntax rules, not whatever injections.
Second, I hope they have some recipe for the non-strings too.
Should escaped strings be “unescaped” (MySQL)
No.
Escaping is for the query, not database.
shows escape codes \n instead of newlines.
you are escaping your strings twice then

Related

mysql - storing json, double quote escape issue

I've been chasing a JSON bug about and discovered that I'm getting different results if I load a file from disk as to loading the (apparently) same file from the db.
Mysql seems to be stealing my escape characters. (I'm using vbscript; my connect string is Driver={MySQL ODBC 5.1 Driver};Server=localhost;Database=foo;User=foo;Password=f00;Option=3;)
On doing a conn.execute(...)
update courses set config = '{"set": "value in \"here\" ok?"}' where id = 21;
select config from courses where id = 21;
// prints changed value {"set": "value in "here" ok?"}
What's going on here? Why is mysql taking out my \" and turning them into "?
If I use workbench on the server (windows 2003) and use the feature "load value from file" in the results pane, I can import the json to the field and it retains the proper escape sequence values. But doing an update / insert, the escape sequence characters are lost.
MySQL does escaping as well. If you need escape characters back when retrieving data, you'll need to escape the escape characters. E.g. "\\".
The issue will solve when you use \\".
But why you have to add an extra \ ??
PHP will treat \" as escape sequence and replace it with "and your final query will be like '{"set": "value in "here" ok?"}'.
By adding extra \ you can build '{"set": "value in \"here\" ok?"}' (PHP will replace \\ with \)

Loading data into MySQL: How to deal with backslashes?

I downloaded a tab-delimited file from a well-known source and now want to upload it into a MySQL table. I am doing this using load data local infile.
This data file, which has over 10 million records, also has the misfortune of many backslashes.
$ grep '\\' tabd_file.txt | wc -l
223212
These backslashes aren't a problem, except when they come at the end of fields. MySQL interprets backslashes as an escape character, and when it comes at the end of the field, it messes up the next field, or possibly the next row.
In spite of these backslashes, I only received 6 warnings from MySQL when loading it into a table. In each of these warnings, a row doesn't have the proper number of columns precisely because the backslash concatenated two adjacent fields in the same row.
My question is, how to deal with these backslashes? Should I specify load data local infile [...] escaped by '' to remove any special meaning from them? Or would this have unintended consequences? I can't think of a single important use of an escape sequence in this data file. The actual tabs that terminate fields are "physical tabs", not "\t" sequences.
Or, is removing the escape character from my load command bad practice? Should I just replace every instance of '\' in the file with '\\'?
Thanks for any advice :-)
If you don't need the escaping, then definitely use ESCAPED BY ''.
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
"If the FIELDS ESCAPED BY character is empty, escape-sequence interpretation does not occur. "

Using Excel to create a CSV file with special characters and then Importing it into a db using SSIS

Take this XLS file
I then save this XLS file as CSV and then open it up with a text editor. This is what I see:
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB""C","D,E",F,03,"3,2"
I see that the double quote character in column C was stored as AB""C, the column value was enclosed with quotations and the double quote character in the data was replaced with 2 double quote characters to indicate that the quote is occurring within the data and not terminating the column value. I also see that the value for column G, 3,2, is enclosed in quotes so that it is clear that the comma occurs within the data rather than indicating a new column. So far, so good.
I am a little surprised that all of the column values are not enclosed by quotes but even this seems reasonable OK when I assume that EXCEL only specifies column delimieters when special characters like a commad or a dbl quote character exists in the data.
Now I try to use SQL Server to import the csv file. Note that I specify a double quote character as the Text Qualifier character.
And a command char as the Column delimiter character. However, note that SSIS imports column 3 incorrectly,eg, not translating the two consecutive double quote characters as a single occurence of a double quote character.
What do I have to do to get Excel and SSIS to get along?
Generally people avoid the issue by using column delimiter chactacters that are LESS LIKELY to occur in the data but this is not a real solution.
I find that if I modify the file from this
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB""C","D,E",F,03,"3,2"
...to this:
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB"C","D,E",F,03,"3,2"
i.e, removing the two consecutive quotes in column C's value, that the data is loaded properly, however, this is a little confusing to me. First of all, how does SSIS determine that the double quote between the B and the C is not terminating that column value? Is it because the following characters are not a comma column delimiter or a row delimiter (CRLF)? And why does Excel export it this way?
According to Wikipedia, here are a couple of traits of a CSV file:
Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
However, it looks like SSIS doesn't like it that way when importing. What can be done to get Excel to create a CSV file that could contain ANY special characters used as column delimiters, text delimiters or row delimiters in the data? There's no reason that it can't work using the approach specified in Wikipedia,. which is what I thought the old MS DTS packages used to do...
Update:
If I use Notepad change the input file to
Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8
"1","ABC","AB""C","D,E","F","03","3,2","AB""C"
Excel reads it just fine
but SSIS returns
The preview sample contains embedded text qualifiers ("). The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
Conclusion:
Just like the error message says in your update...
The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
Confirmed bug in Microsoft Connect. I encourage everyone reading this to click on this aforementioned link and place your vote to have them fix this stinker. This is in the top 10 of the most egregious bugs I have encountered.
Do you need to use a comma delimiter.
I used a pipe delimiter with no Text qualifier and it worked fine. Here is my output form the text file.
1|ABC|AB"C|D,E|F|03|3,2
You have 3 options in my opinion.
Read the data into a stage table.
Run any update queries you need on the columns
Now select your data from the stage table and output it to a flat file.
OR
Use pipes are you delimiters.
OR
Do all of this in a C# application and build it in code.
You could send the row to a script in SSIS and parse and build the file you want there as well.
Using text qualifiers and "character" delimited fields is problematic for sure.
Have Fun!

Openoffice - CSV-export: is there a default escape-charcter?

As far as I can see OpenOffice, when it comes to save a file as a csv-file, encloses all strings in quote-characters.
So is there any need for an escape character?
and related to this question:
Does OpenOffice have a default escape character?
I'm also wondering if there is a way to choose the escape character when saving OpenOffice as csv. phpmyadmin was not accepting a 9,000 line 50+ column spreadsheed in .ods format and there doesn't seem to be a way to choose the escape character when saving as CSV.
So I had to save as csv, open in word, and use some find/replace tricks to change the escape character to \ (back slash). Default is to use double quotes to escape double quotes, and phpmyadmin won't accept that format.
To properly convert the file to use \ (back-slash) to escape double-quotes, you have to do this:
Pick a placeholder character string, e.g. 'abcdefg', that does
not occur anywhere in the csv.
Find/replace """ (three double-quotes in a row) with the placeholder. This is to prevent possibly incorrect results in the next step.
Find/replace "" (two quotes in a row, representing one quote that should be escaped), with \" (back-slash double-quote). If you did this without find/replacing """ it's conceivable you could get a result like "\" instead of \"". Better safe than sorry.
Find/replace the placeholder string with \"" (back-slash double-quote double-quote).
That will work, unless you happen to have more than one double-quote in a row in your original text fields, which would result in as many as five double-quotes in a row in the resulting .ods or .xlsx csv file (two double-quotes for each escaped double quote, plus another double quote if its at the end of the field).
Escaping in quotes makes life easier for tools parsing the CSV file.
In a recent version of LibreOffice (3.4.4), the CSV export was not handled correctly by phpMyAdmin. Since LibreOffice doesn't provide an escape character, the phpMyAdmin's default "CSV" import feature "Columns escaped with:" didn't work well. The data was always inconsistent.
However, using the option CSV using LOAD DATA did work, only if the value in Columns escaped by option was removed. I presume phpMyAdmin uses the default MySQL LOAD DATA command, and thus the control is passed to MySQL for data processing. In my scenario it resulted in accurate data import.

mysqlimport and double-quotes

We have a large tab-delimited text file (approximately 120,000 records, 50MB) that we're trying to shove into MySQL using mysqlimport. Some fields are enclosed in double-quotes, some not. We're using the fields-optionally-enclosed-by='\"' switch, but the problem is some of the field values themselves contain double-quotes (indicating inches) so the delimited field value might be something "ABCDEF19"". Make sense?
We have no control over the source of the file, so we can't change the formatting there. I tried removing the fields-optionally-enclosed-by switch, but then the double-quotes that surround the values are imported.
he records with quotes in the values are getting seriously messed up. Is there a way we can tell mysqlimport that some fields are optionally enclosed by quotes, but may still contain quotes? We've thought maybe a global search and replace to escape the double-quotes in field values? Or any other suggestions?
If your data is including quotes inside of the body of the field quote without delimiting that somehow, you have a problem. You can't guarantee that mysqlimport will do this properly.
Massage the data first before trying to insert it in this way.
Luckily, it is tab-delimited, so you can run a regex to replace the quotes with a delimited version and then tell mysqlimport the delimiter.
You could import it with the quotes (fields-optionally-enclosed-by switch removed) and then run a check where if the value has double quotes at the beginning and end (assuming none of the values have inches at the beginning) then truncate by 1 character from the beginning and end to remove the extra quotes you got from importing.
EDIT: after reading kekoav's response I have to agree that if you are able to manipulate the file before importing that would be a much wiser option, but if you are forced to remove quotes afterwards, you could use something like this:
UPDATE table
SET column =
IF(
STRCMP(LEFT(table.column,1),'"'),
MID(table.column,2,(LENGTH(table.column)-2)),
table.column
)
for every 'column' in 'table'