Does MATLAB's Database Toolbox have a function to sanitize inputs? I can't find any mention of one in the documentation.
I have a bunch of strings that I'd like to write to a MySQL database. Some of the strings contain apostrophes, and these are causing errors. I'm looking for a simple way to preprocess the strings to make them database-friendly.
Also, it's not necessary in my application to be able to reconstruct the original strings exactly. The preprocessing step never needs to be "undone".
In the end I used MATLAB's genvarname function to preprocess my strings. This function doesn't do database sanitization, per se, and it's not invertible, but it does remove apostrophes. It met my needs.
Related
I see similar questions for Javascript and other languages, but I work in T-SQL.
I have a function that removes non-alpha characters from a string and the output is varchar(). I would like a similar function to do the same but output in nvarchar() to prevent implicit conversions when I am dealing with nvarchar() data.
I know I could simply have two functions and call the appropriate one when needed, but for backwards compatibility, it would be nice to have a single function that could check the table being updated, or something along those lines, and output the appropriate varchar() or nvarchar() string. Then I could universally replace all occurrences of this function with the single 'one-size-fits-all' function.
Has anybody ever seen or come up with something like this, or is this simply too much to ask of a function, and I should consider using a stored procedure?
I have a Python script which collects data and sends it to my MySQL table.
I noticed that the "Cost" is sometimes 0,95, which results in 0 in my table since my table uses "0.95" instead of "0,95".
I assume the best solution is to convert the , to . in my Python script by using:
variable.replace(",", ".")
However, couldn't one solution be to change the format in my MySQL table, so that I store numbers in this format:
1100
0,95
0,1
150000
My Django Model
cost = models.DecimalField(max_digits=10, decimal_places=4, default=None)
Any feedback on how to best solve this issue?
Thanks
Your first instinct is correct: convert the "unusual" (comma-decimal) input into the standard format that MySQL uses by default (dot-decimal) at the first point where you receive it.
There are lots of ways to write numbers
Be careful, though, that you don't get stung by people using commas as thousands separators, as in "3,203,907.23", or by the European form "3.203.907,23", the Swiss "3'203'907,23", or even this form, which is widely used in India: "32,03,907.71" (yes, I did mean to type only two digits there!)
To make your life easier, the rule for currencies is relatively simple:
where a dot or comma is followed by only two digits at the end of the string, that character is acting as the decimal separator.
Once you know which is the decimal separator, you can safely remove all other non-digits from the string, change the decimal separator you found to . then use any standard library string-to-number conversion.
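The rule above can be sketched in Python roughly as follows. This is only an illustration: the function name parse_currency is made up for the example, negative amounts are not handled, and anything that does not end in a separator plus exactly two digits (for instance "0,1" from the question) is rejected rather than guessed at, so real data would need a wider rule.

```python
import re
from decimal import Decimal


def parse_currency(text: str) -> Decimal:
    """Parse a currency string using the 'two digits at the end' rule.

    parse_currency is a hypothetical helper, not part of any library.
    """
    text = text.strip()
    # A dot or comma followed by exactly two digits at the end of the
    # string is treated as the decimal separator.
    match = re.search(r"[.,](\d{2})$", text)
    if match:
        # Everything before the separator: drop all non-digits
        # (thousands separators such as , . or ').
        whole = re.sub(r"\D", "", text[:match.start()]) or "0"
        return Decimal(f"{whole}.{match.group(1)}")
    if text.isdigit():
        # Plain whole number with no separators at all.
        return Decimal(text)
    # Anything else is ambiguous under this rule, so refuse to guess.
    raise ValueError(f"cannot determine decimal separator in {text!r}")
```

Returning a Decimal rather than a float keeps the value exact, which matches the DecimalField in the question's Django model.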
Storage format isn't presentation format
Yes, you can tell MySQL to use comma as its decimal separator, but doing that will break so much of your code - including the parts of the framework that read from the database and expect dot-decimal numbers - that you'll regret doing it that way very quickly...
There's a general principle at work here: you should do your data storage and processing using a format that is easy to process, interchangeable with other systems, and understood by other software developers.
Consider what happens if you need to allow a different framework to access your MySQL database to generate reports... whoever develops that software (and it may be you) will be glad that the numbers are all stored the way numbers are "always" stored in databases.
Convert on the way in, re-convert on the way out
Where you need to accept input in a different format, convert that input into your standardised format as early as possible.
When you need to use an output format, do the conversion to that format as late as possible.
The idea is to keep as much of your system "unexceptional" as possible. A programmer who has to remember what numeric format will be in force at the time when a given method is called is not a happy programmer.
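As a rough illustration of converting on the way out, the value can stay a plain Decimal everywhere inside the program and only be turned into comma-decimal text at the point of display. The helper name format_comma_decimal and the default of two decimal places are assumptions for this sketch.

```python
from decimal import Decimal


def format_comma_decimal(value: Decimal, places: int = 2) -> str:
    """Render a Decimal with a comma as the decimal separator.

    format_comma_decimal is a hypothetical helper for this example.
    """
    # Format in the standard dot-decimal form first, then swap the
    # separator at the very last step.
    text = f"{value:.{places}f}"
    return text.replace(".", ",")
```

Nothing upstream of this call needs to know that the display format uses commas.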
P.S.
The option you're talking about in MySQL is an example of this pattern: it doesn't change how numeric data is stored. All that changes is how you pass numbers to MySQL and how it presents them back to you.
I just found out about placeholders in DBI https://metacpan.org/pod/DBI#Placeholders-and-Bind-Values and it seems to be handling various codes pretty well.
Should I be forcing escape regardless? Are there any scenarios where the placeholders would fail based on the input?
If you escape them and then use bound placeholders, they will end up double escaped, which is not what you want. Just use placeholders. (I frequently use them even when the input is trusted, because it looks cleaner.)
There is rarely a reason to use escaping instead of placeholders. An example would be dynamically generating and manipulating a query as an SQL string, but you really shouldn't do that anyway (there are plenty of libraries on CPAN for generating SQL).
The only example that I know of in which a placeholder would fail based on input that would not fail with string interpolation would be when you are interpolating column names, LIMIT clauses, or some such from a string (but again, that is dynamically generating SQL like I mentioned above).
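The same idea, sketched in Python with the standard-library sqlite3 module rather than Perl's DBI (sqlite3 also uses ? placeholders). The table, the find_people helper, and the ALLOWED_SORT_COLUMNS whitelist are invented for the example. Values go through placeholders with no escaping; the column name, which cannot be bound, is checked against a fixed list before being interpolated.

```python
import sqlite3

# Hypothetical whitelist: identifiers cannot be bound as placeholders,
# so only known-good column names are ever interpolated.
ALLOWED_SORT_COLUMNS = {"id", "name"}


def find_people(conn, name, sort_by="id"):
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    query = f"SELECT id, name FROM people WHERE name = ? ORDER BY {sort_by}"
    # The value is passed separately; the driver handles quoting.
    return conn.execute(query, (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("O'Brien",), ("Robert'); DROP TABLE people;--",)],
)
rows = find_people(conn, "O'Brien")
```

Both names are stored exactly as given, apostrophes included, and no manual escaping is applied at any point.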
Placeholders >> manual escaping
I have an OLE DB Data source and a Flat File Destination in the Data Flow of my SSIS Project. The goal is simply to pump data into a text file, and it does that.
Where I'm having problems is with the formatting. I need to be able to rtrim() a couple of columns to remove trailing spaces, and I have a couple more that need their leading zeros preserved. The current process is losing all the leading zeros.
The rtrim() can be done by simple truncation and ignoring the truncation errors, but that's very inelegant and error prone. I'd like to find a better way, like actually doing the rtrim() function where needed.
Exploring similar SSIS questions & answers on SO, the thing to do seems to be "Use a Script Task", but that's usually just thrown out there with no details, and it's not at all an intuitive thing to set up.
I don't see how to use scripting to do what I need. Do I use a Script Task on the Control Flow, or a Script Component in the Data Flow? Can I do rtrim() and pad strings where needed in a script? Anybody got an example of doing this or similar things?
Many thanks in advance.
With SSIS, there are many possible solutions! From what you mention, you could use a Derived Column transform within a Data Flow to perform the trimming and padding - you would use an expression to do this, it would be relatively straightforward. Eg,
rtrim([ColumnName])
to trim and something along the lines of
right("000000" + [ColumnName], 6)
to pad (this is off the top of my head so syntax may not be exact).
As for the scripting method, that is also valid. You would use a Script Component transform in the Data Flow and use VB.NET or C# (if you have 2008) string manipulation methods (e.g. strVariable.TrimEnd() to remove trailing spaces).
I am using FCKEditor with CakePHP and when I save data sent from the editor I want to run the htmlspecialchars() and mysql_real_escape_string() functions on the data to clean it before I store it in my database. The problem is I am not really sure where to do this within the CakePHP framework. I tried in the controller like this:
function add()
{
    if (!empty($this->data))
    {
        if ($this->Article->save(mysql_real_escape_string(htmlspecialchars($this->data))))
        {
            $this->Session->setFlash('Your article has been saved.');
            $this->redirect(array('action' => 'index'));
        }
    }
}
However $this->data is an array and those functions expect strings so that won't work. Do I do it in the validate array of the model? I have no idea. Also, let me know if running htmlspecialchars() inside of mysql_real_escape_string() is not a good practice.
Don't use htmlspecialchars() when you save data, use it when you output data to HTML. What if you need to look at the data in some context other than HTML?
Also I'm not a Cake user, but I'd be surprised if you need to apply mysql_real_escape_string() as you save data either. The database access layer should protect you against SQL injection, and by doing it manually you're going to end up storing doubly-escaped strings.
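A small Python analogue of this advice (not CakePHP code; the articles table and the sample text are made up): the raw text is stored unchanged through a bound parameter, and HTML escaping is applied only at the moment the value is rendered.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")

# Store the text exactly as submitted; the placeholder takes care of
# SQL quoting, so no manual escaping is needed on the way in.
raw_body = "<b>Tom's</b> article & notes"
conn.execute("INSERT INTO articles (body) VALUES (?)", (raw_body,))

# Read it back unchanged, then escape only when producing HTML.
stored_body = conn.execute("SELECT body FROM articles").fetchone()[0]
rendered = html.escape(stored_body)
```

Because the database holds the original text, the same row can later be written to a non-HTML context (a text export, an API response) without having to undo any escaping.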
The short and simple answer is - if the database access has been abstracted away, there is no need for you to call these functions at all.
The only place where they are needed is if you build actual SQL from bits of strings. Which you should not do anyway, but that's another story.
Bottom line is - the framework will do the right thing, don't interfere.
EDIT: As Bill Karwin points out - htmlspecialchars() is from the completely wrong department here.
If you are building your own SQL strings with CakePHP, then CakePHP provides the escape function:
escape(string $string, string $connection)
http://book.cakephp.org/view/1186/escape