EF Core - Incorrect string value - MySQL

I am using EF to update a field in my MySQL DB and ran across the issue of attempting to save data that is not allowed by the column's character set. For example, ^âÂêÊîÎôÔûÛŵŷ contains characters that fall outside a column whose character set is latin1.
Running an update/insert with the above example, I get the exception:
The database update did not take place due to..Incorrect string value
I know what the problem is, and I don't want to keep the characters. The data is usually provided via the UI, which would normally control what is passed in; however, the system is also callable via an API that allows callers to send whatever data they like. In the above case, I would like to drop those characters or just replace them with a question mark; basically, ignore them.
This system already exists in an older language, where the rule to (silently..) ignore them already applies; I need the error not to be raised and for the save to keep what it can. I have seen how I can modify the statements for this, or how I can modify the string data coming in, but I have thousands of these. Is there another way to achieve this?
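For illustration, here is the behaviour I'm after, expressed directly in SQL (table and column names are made up). MySQL's CONVERT ... USING replaces any character the target character set can't represent with a question mark:

SELECT CONVERT('^âÂêÊîÎôÔûÛŵŷ' USING latin1);
-- returns '^âÂêÊîÎôÔûÛ??' (ŵ and ŷ do not exist in latin1)

UPDATE my_table
SET my_column = CONVERT(@incoming_value USING latin1)
WHERE id = 1;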

Related

Best methods to avoid MySQL 1406 errors on VARCHAR

My server is using a MySQL DB, connecting to it via the C++ connector. I'm nearing production and I've been spending some time trying to break things as part of hardening the server.
One action item I had was to see what would happen if I execute a statement with a string longer than the column's VARCHAR length. For example, if I have a column defined as VARCHAR(4) and then set it to the string "hello".
This of course throws an exception with the error code 1406 (Data too long for column).
What I was wondering was if there is a good or standard way to defend against this? Obviously one thing is to check the string length and truncate manually. I can do this; however, there are many tables and several columns with VARCHAR, so my worry is having to update server code whenever a VARCHAR column has its length increased (i.e. code maintainability).
Note that the server does do some validation up front. I'm just trying to defend against a subtle bug or corner case that lets something slip through.
A couple of other options on the table are to disable strict mode, so it will give a warning and truncate, or to convert VARCHAR to TEXT.
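To make the disable-strict option concrete, this is the behaviour I mean (a sketch; table and column names are made up):

-- c is defined as VARCHAR(4)
SET SESSION sql_mode = '';          -- strict mode off for this session
INSERT INTO t (c) VALUES ('hello'); -- succeeds, stores 'hell'
SHOW WARNINGS;                      -- warning 1265: Data truncated for column 'c'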
I was wondering a few things.
Is there a recommended method to handle this situation?
What are the disadvantages of disabling strict?
Is it worth (and is it possible) to query the DB at runtime for the VARCHAR lengths? Note that I'm using the C++ connector. I suppose I could also write a tool that runs before compiling and extracts the VARCHAR lengths from the SQL code used to generate the tables. But that then makes me wonder if I'm over-engineering this.
I'm just sorting through the possible approaches now and thought I'd seek advice from those with more experience with MySQL.
As an experienced database engineer, I would recommend a combination of the following two strategies:
1) If you know that there is a chance, however small, that the data for your varchar(4) could go higher than 4, then make the varchar field larger than 4. For example, if you expect that the field can go as high as 8, then set the field to varchar(10). The beauty of using a varchar field instead of a char is that a varchar only uses whatever storage it needs.
2) If there is a real issue with data constantly being larger than the varchar field length, then you should write your own exception handler to trap the 1406 error. For the exception handling to work properly you will need to come up with a strategy for exactly how you want to handle the exception. For example, you could send an error to the user and ask them to fix the problem, you could accept the data but truncate it so it fits into the field, or you could send the error to a log file to be fixed at a later time.
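As for querying the lengths at runtime: you don't need to parse your table-creation SQL, since information_schema already holds the current definitions. A sketch, with the schema name as a placeholder:

SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'my_schema'  -- placeholder
  AND DATA_TYPE = 'varchar';

Run that once at server startup (or whenever your schema changes) and you have the lengths without hard-coding them.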

Reading Encrypted data with Datastage Tool

I need your help with the DataStage 11.7 tool. I am reading an AES-encrypted column from my source, and the type of the column is nvarchar. When we start our job and read data from the source, the job runs successfully and exactly the same data is moved to my target database, with the same column type.
The problem occurs when I query the data to check whether my source and target values are the same: the query does not return any result. Visually, the source and target values look identical, but the SQL statement returns nothing. The target database is Vertica.
The column values are alphanumeric mixed with special characters, like �D�&7��x��d$�Q
I'm not at all sure this is even properly possible via DataStage - treating encrypted data as a varchar. Some DBs have internal keys that go with the data and require decrypting before extracting. I'm assuming that decrypting, transporting, landing, and then re-encrypting is not an option.
But if I had to take a stab in the dark:
The very first thing I'd check is that the character set and collation are the same in both databases, down to the table level. A difference can result in blank results on the target side.
Also check that the NLS map in DataStage (the map for stages and the collation locale) is set accordingly. What that setting should be, I don't know, but making it the same in DataStage and in both DBs would be ideal; Google it. You'd need to comment on what is already set in the DBs. And run tests. I'm not sure the DataStage default of ISO-8859-1 will work.
Please post your solution if you find one.

Data Cleanse ENTIRE Access Table of Specific Value (SQL Update Query Issues)

I've been searching for a quick way to do this after my first few ideas failed me, but I haven't found anything.
My Issue
I'm importing raw client data into an Access database, where the flat file they provide is parsed and converted into a standardized format for our organization. I do this for all of our clients, but this particular client's software gives us a file that puts "(NULL)" in every field that should be NULL. As a result, I have a ton of strings rather than null fields!
My goal is to do a data cleanse of the entire TABLE, rather than perform the cleanse at the FIELD level (as I do in my temporary solution below).
Data Cleanse
Temporary Solution:
I can't add those strings to our data warehouse, so for now I just have a query with an IIF statement check that replaces "(NULL)" with "" for each field (which took a while to set up, since the client file has roughly 96 fields). This works. However, we work with hundreds of clients, so I'd like a scalable solution that doesn't require many changes if another client has a similar file; not to mention that if this client changes something in their file, I might have to redo my field-specific statements.
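For reference, the field-level check looks like this, repeated for each of the roughly 96 fields (field names here are examples):

SELECT IIf([Field1] = "(NULL)", "", [Field1]) AS Field1Clean,
       IIf([Field2] = "(NULL)", "", [Field2]) AS Field2Clean
FROM [ImportedRaw_T];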
Long-term Solution:
My first thought was an UPDATE query. I was hoping I could do something like:
UPDATE [ImportedRaw_T]
SET [ImportedRaw_T].* = ""
WHERE ((([ImportedRaw_T].* = "(NULL)")));
This would be easily scalable, since for further clients I'd only need to change the table name and replace "(NULL)" with their particular default. Unfortunately, you can't use SELECT * with an update query.
Can anyone think of a work-around to the SELECT * issue for the update query, or have a better solution for cleansing an entire table, rather than doing the cleanse at the field level?
SIDE NOTES
This conversion is 100% automated currently (Access is called via a watch folder batch), so anything requiring manual data manipulation / human intervention is out.
I've tried using a batch script to cleanse the data in the .txt file before importing it into Access - however, this broke the fixed-width format of the .txt, which caused even larger issues with the automatic import of the file into Access. So I'd prefer to do this in Access if possible.
Any thoughts and suggestions are greatly appreciated. Thanks!
Unfortunately, it's impossible to implement this in SQL using wildcards instead of column names; no such syntax exists.
I would suggest a VBA solution: cycle through all the table's fields and, if a field's data type is string, generate and execute a SQL UPDATE command for that field (a sketch of the generated statement follows below).
Also, use Null instead of "" if you really need Nulls in the field rather than empty strings; they behave differently in calculations.
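The statement the VBA loop would generate and run for each text field would look like this (the field name is a placeholder):

UPDATE [ImportedRaw_T]
SET [SomeField] = Null
WHERE [SomeField] = "(NULL)";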

Pictures using Postgres and Xojo

I have converted from a MySQL database to Postgres. During the conversion, the picture column in Postgres was created as bytea.
This Xojo code works in MySQL but not Postgres.
Dim mImage As Picture
mImage = rs.Field("Picture").PictureValue
Any ideas?
I don't know about this particular issue, but here's what you can do to find out yourself, perhaps:
Pictures are stored as BLOBs in the database. This means that the column must also be declared as BLOB (or a similar binary type). If it was accidentally declared as TEXT, things would still work as long as the database does not get exported by other means. I.e., as long as only your Xojo code reads and writes the record using the PictureValue functions, the data is kept in BLOB form. But if you then convert to another database, the BLOB data would be read as text, and in that process it might get mangled.
So, it may be relevant to let us know how you converted the DB. Did you perform an export as SQL commands and then import it into Postgres by running those commands? Do you still have the export file? If so, find a record with picture data in it and see if that data starts with x' followed by hex byte code, e.g. x'45FE1200... and so on. If it doesn't, that's another indicator for my suspicion.
So, check the type of the Picture column in your old DB first. If that specifies a binary data type, then the above probably does not apply.
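A quick way to check the declared type is to ask information_schema, which works in both MySQL and Postgres (the table name is a guess; note that Postgres folds unquoted identifiers to lower case):

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'pictures';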
Next, you can look at the actual binary data that Xojo reads. To do that, get the BlobValue instead of the PictureValue, and store it in a MemoryBlock. Do the same for a single picture, both with the old and the new database. The MemoryBlocks should contain the same bytes. If not, that would suggest that the data was not transferred correctly. Why? Well, that depends on how you converted it.

Restrict varchar() to range a-z

I'm working on implementing and designing my first database and have a lot of columns with names and addresses and the like.
It seems logical to place a CHECK constraint on these columns so that the DB only accepts values from an alphanumeric range (disallowing any special characters).
I am using MySQL, which as far as I can tell doesn't support user-defined types. Is there an easy way to do this?
It seems worthwhile to prevent bad data from entering the DB, but should this complex checking be offloaded to the application instead?
You can't do it with a CHECK constraint if you're using MySQL (the question is tagged with mysql, so I presume this is the case): MySQL doesn't support CHECK constraints, at least in versions before 8.0.16, when it began enforcing them. They are allowed in the syntax (to be compatible with DDL from other databases), but are otherwise ignored.
You could add a trigger to the table that fires on insert and update and checks the data for compliance, but if you find a problem there's no way to raise an exception from a MySQL stored program (at least before MySQL 5.5 introduced SIGNAL).
I have used the workaround of hitting a table that doesn't exist, but has a name that conveys the meaning you want, e.g.
update invalid_characters set col1 = 1;
and hope that the person reading the "table invalid_characters does not exist" message gets the idea.
There are several settings that let you change how MySQL handles certain situations, but those aren't enough for your case.
I would stick with data validation on the application side, but if you need validation on the database side, you have two options:
CREATE PROCEDURE that validates and inserts the data, and either does nothing or raises an error by calling SIGNAL
CREATE TRIGGER ... BEFORE INSERT which validates the data and stops the insert, as suggested in this stackoverflow answer (a sketch follows below)
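A minimal sketch of the trigger variant (table and column names are examples; needs MySQL 5.5 or later for SIGNAL):

DELIMITER //
CREATE TRIGGER person_name_check
BEFORE INSERT ON person
FOR EACH ROW
BEGIN
  -- reject anything outside the plain alphanumeric range
  IF NEW.name REGEXP '[^a-zA-Z0-9 ]' THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'name contains disallowed characters';
  END IF;
END//
DELIMITER ;

You would add a matching BEFORE UPDATE trigger if updates also need to be checked.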