How to convert values from a database to numeric? - mysql

I'm grabbing all the values of a column using:
> myValues <- dbGetQuery(mydb,"select average_Medicare_allowed_amt from STAGING_MEDICAREPUF")
Because the values are defined as varchar, when I run summary(myValues), R does not recognize them as numeric.
Assuming I have no access to the backend schema, and am unable to cast the varchars to decimals, is it possible to first convert myValues to be numerical and then get a summary?

In MySQL, I find that the easiest way to convert to a number value is to simply add zero:
select (average_Medicare_allowed_amt + 0) as average_Medicare_allowed_amt
Note the use of the column alias. This allows you to refer to the resulting value using the same name.
MySQL does "silent" conversion. If it encounters an error or a non-numeric character, then the conversion stops. So, 'abc' + 0 returns 0 instead of generating an error.
And, regarding your comment, I have never heard of "cast()" permissions in any database.
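If the conversion has to happen on the client after the values have been fetched as text, the idea is the same in any language: convert each string to a number, then summarize. Below is a minimal sketch in Python rather than R; the function name `summarize` and the sample values are made up for illustration, and it assumes every value is a cleanly formatted number.

```python
import statistics

def summarize(text_values):
    """Convert fetched text values to floats and return basic summary statistics."""
    # float() raises ValueError on anything that is not a clean number,
    # unlike MySQL's silent conversion.
    numbers = [float(v) for v in text_values]
    return {
        "min": min(numbers),
        "median": statistics.median(numbers),
        "mean": statistics.mean(numbers),
        "max": max(numbers),
    }

# Hypothetical values as they might arrive from a varchar column.
print(summarize(["12.50", "7", "103.25", "41.25"]))
```

Unlike the `+ 0` trick, this version fails loudly on a value such as 'abc', which may or may not be what you want.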

Related

Reading negative numbers in a column

I'm using SSIS to separate good data from unusable data. In order to do that I used derived columns, a script task and a conditional split where I assigned certain conditions. One of the conditions I need to apply is that none of the numbers in one column can be negative. I'm guessing that the best way to solve this would be using a conditional split, but I cannot get it to work. I'm new to SSIS, so any help would be appreciated.
You'd have an Expression like
[MyCaseSensitiveColumnName] < 0
and then name the output path something like BadData_NegativeValue
From the comments
that is what I did before, but I'm getting an error saying that The data types "DT_WSTR" and "DT_I4" are incompatible for binary operator ">"
That error message indicates that you are attempting to compare a Unicode string (DT_WSTR) with an integer (DT_I4), and the expression language does not allow that.
To resolve this type incompatibility, you would need to first convert the value of MyCaseSensitiveColumnName from DT_WSTR to an integer.
I'd likely add a Derived Column Component to my data flow and create a new column called MyCaseSensitiveColumnNameAsInteger with an expression like
(DT_I4) [MyCaseSensitiveColumnName]
Now, that may be perilous depending on the quality of your source data. I don't know why you are pulling numeric data in as a string. If there could be non-whole numbers in the data set, then we will need to check before making the cast. If there are NULLs in that data set, those too may cause issues.
That would result in our conditional split check becoming
[MyCaseSensitiveColumnNameAsInteger] < 0
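The cast-then-split logic above can be sketched outside SSIS to show where NULLs and non-whole numbers end up. This is a Python model of the data flow, not SSIS expression syntax; the function `split_rows`, the column name `qty`, and the three output names are hypothetical.

```python
def split_rows(rows, column):
    """Route rows into good / negative / unparseable buckets, like a conditional split."""
    good, bad_negative, bad_unparseable = [], [], []
    for row in rows:
        raw = row.get(column)
        try:
            # Plays the role of the derived column cast to an integer.
            value = int(raw)
        except (TypeError, ValueError):
            # None (NULL) raises TypeError; text such as '2.5' or 'abc' raises ValueError.
            bad_unparseable.append(row)
            continue
        if value < 0:
            bad_negative.append(row)
        else:
            good.append(row)
    return good, bad_negative, bad_unparseable

rows = [{"qty": "5"}, {"qty": "-3"}, {"qty": None}, {"qty": "2.5"}]
good, negative, unparseable = split_rows(rows, "qty")
```

Sending the rows that fail the cast to their own output keeps a bad value from failing the whole data flow.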

Select statement returns data although given value in the where clause is false

I have a table on my MySQL db named membertable. The table consists of two fields, memberid and membername. The memberid field is of type integer and uses auto_increment starting from 2001. The membername field is of type varchar.
The membertable has two records with the same order as described above. The records look like this :
memberid : 2001
membername : john smith
memberid : 2002
membername : will smith
I found something weird when I ran a SELECT statement against the memberid field. Running the following statement :
SELECT * FROM `membertable` WHERE `memberid` = '2001somecharacter'
It returned the first data.
Why did that happen? There's no record with memberid = 2001somecharacter. It looks like MySQL only looks at the first 4 characters (2001) and, once it finds matching data (the row returned above), it ignores the remaining characters.
How could this happen? And is there any way to turn off this behavior?
--
membertable uses innodb engine
This happens because MySQL tries to convert "2001somecharacter" into a number, which yields 2001.
Since you're comparing a number to a string, you should use
SELECT * FROM `membertable` WHERE CONVERT(`memberid`,CHAR) = '2001somecharacter';
to avoid this behavior.
Or, to do it properly, do NOT put your search value in quotes, so that it has to be a number; otherwise the query fails with a syntax error. Then, in the front end, make sure the value is a number before passing it into the query.
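A minimal sketch of that front-end check, written in Python on the assumption that the application receives the id as text. The function name `parse_member_id` is hypothetical; the validated integer would then be passed to the query as a bound parameter rather than spliced into the SQL string.

```python
def parse_member_id(raw):
    """Return the id as an int, or raise ValueError if it is not purely ASCII digits."""
    # isdigit() alone also accepts some non-ASCII digit characters, hence isascii().
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"not a valid member id: {raw!r}")
    return int(raw)

member_id = parse_member_id("2001")        # accepted
# parse_member_id("2001somecharacter")     # would raise ValueError
```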
Your finding is expected MySQL behaviour.
MySQL converts a varchar to an integer starting from the beginning. As long as there are numeric characters which can be converted, they are included in the conversion. If there's a letter, the conversion stops and returns the integer value of the numeric string read so far...
Here's some description of this behavior on the MySQL documentation Site. Unfortunately, it's not mentioned directly in the text, but there's an example which exactly shows this behaviour.
MySQL is very liberal in converting string values to numeric values when evaluated in numeric context.
As a demonstration, adding 0 causes the string to be evaluated in a numeric context:
SELECT '2001foo' + 0 -- 2001
, '01.2-3E' + 0 -- 1.2
, 'abc567g' + 0 -- 0
When a string is evaluated in a numeric context, MySQL reads the string character by character, until it encounters a character where the string can no longer be interpreted as a numeric value, or until it reaches the end of the string.
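The rule described above can be approximated in a few lines of Python. This is only a model of the behaviour, not MySQL's actual parser: the regular expression is an assumed grammar (optional leading whitespace, optional sign, digits with an optional fraction, optional exponent), and `mysql_like_number` is a made-up name.

```python
import re

# Assumed grammar for the longest leading numeric prefix.
_NUMERIC_PREFIX = re.compile(r"\s*[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")

def mysql_like_number(text):
    """Return the value of the leading numeric prefix of text, or 0.0 if there is none."""
    match = _NUMERIC_PREFIX.match(text)
    return float(match.group()) if match else 0.0

for sample in ["2001foo", "01.2-3E", "abc567g", "1e6"]:
    print(repr(sample), "->", mysql_like_number(sample))
```

The three strings from the demonstration give 2001.0, 1.2 and 0.0 under this model, matching the values shown in the query.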
I don't know of a way to "turn off" or disable this behavior. (There may be a setting of sql_mode that changes this behavior, but that change would likely affect other SQL statements that currently work, and they might stop working.)
Typically, this kind of check of the arguments is done in the application.
But if you need to do this in the SELECT statement, one option would be to cast/convert the column to a character string, and then do the comparison.
But that can have some significant performance consequences. If we do a cast or convert (or any function) on a column that's in a condition in the WHERE clause, MySQL will not be able to use a range scan operation on a suitable index. We're forcing MySQL to perform the cast/convert operation on every row in the table, and compare the result to the literal.
So, that's not the best pattern.
If I needed to perform a check like that within the SQL statement, I would do something like this:
WHERE t.memberid = '2001foo' + 0
AND CAST('2001foo' + 0 AS CHAR) = '2001foo'
The first line is doing the same thing as the current query. And that can take advantage of a suitable index.
The second condition is converting the same value to a numeric, then casting that back to character, and then comparing the result to the original. With the values shown here, it will evaluate to FALSE, and the query will not return any rows.
This will also not return a row if the string value has a leading space, ' 2001'. The second condition is going to evaluate as FALSE.
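The second condition is a round-trip test: convert to a number, convert back to text, and compare with the original. A rough Python model of that test for integer ids follows; `survives_round_trip` is a hypothetical name, and the prefix rule is simplified to whitespace, an optional sign and digits.

```python
import re

def survives_round_trip(text):
    """True only if text equals the canonical text form of its leading integer prefix."""
    match = re.match(r"\s*[+-]?\d+", text)
    number = int(match.group()) if match else 0   # no numeric prefix converts to 0
    return str(number) == text

print(survives_round_trip("2001"))     # the exact id
print(survives_round_trip("2001foo"))  # trailing characters
print(survives_round_trip(" 2001"))    # leading space
```

Only the first call returns True; the other two convert to 2001 but do not match the original text when converted back.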
When comparing an INT to a 'string', the string is converted to a number.
Converting a string to a number takes as many of the leading characters as it can and still be a number. So '2001character' is treated as the number 2001.
If you want non-numeric characters in memberid, make it VARCHAR.
If you want only numeric ids, then reject input such as '200.1character'.

mysql SUM of VARCHAR fields without using CAST

When SUM is used in a query on a field of type VARCHAR in a MySQL database, does SUM automatically convert it into a number?
I tried this by using
SELECT SUM(parametervalue) FROM table
and it turns out that MySQL returns the sum, although I expected it to throw an error because the "parametervalue" field is of VARCHAR type.
MySQL does silent conversion for a string in a numeric context. Because it expects a number for the sum(), MySQL simply does the conversion using the leading "numbers" from a string. Note that this includes decimal points, minus signs, and even e representing scientific notation. So, '1e6' is interpreted as a number.
In code, I personally would make the conversion explicit by adding 0:
SELECT SUM(parametervalue + 0) FROM table
Ironically, the cast() might return an error if the string is not in a numeric format, but this doesn't return an error in that case.
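To see what a lenient sum does with messy text, here is a small Python model that contrasts it with a strict conversion. It is not MySQL code: the prefix pattern is an assumed approximation of the leading-number rule, `lenient_sum` is a made-up name, and the sample values are invented.

```python
import re

# Assumed approximation of "leading number": sign, digits, fraction, exponent.
_PREFIX = re.compile(r"\s*[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")

def lenient_sum(values):
    """Sum the leading numeric prefix of each string, counting non-numeric text as 0."""
    total = 0.0
    for value in values:
        match = _PREFIX.match(value)
        total += float(match.group()) if match else 0.0
    return total

values = ["10", "2.5abc", "n/a", "1e3"]
print(lenient_sum(values))          # 10 + 2.5 + 0 + 1000

# A strict conversion rejects the same data instead of guessing.
try:
    sum(float(v) for v in values)
except ValueError as exc:
    print("strict conversion failed:", exc)
```

The lenient total hides the fact that two of the four values were not clean numbers, which is the trade-off to keep in mind when relying on silent conversion.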

Force mySQL queries to be characters not numeric in R

I'm using RODBC to interface R with a MySQL database and have encountered a problem. I need to join two tables based on unique ID numbers (IDNUM below). The issue is that the ID numbers are 20 digit integers and R wants to round them. OK, no problem, I'll just pull these IDs as character strings instead of numeric using CAST(blah AS CHAR).
But R sees the incoming character strings as numbers and thinks "hey, I know these are character strings... but these character strings are just numbers, so I'm pretty sure this guy wants me to store this as numeric, let me fix that for him" then converts them back into numeric and rounds them. I need to force R to take the input as given and can't figure out how to make this happen.
Here's the code I'm using (Interval is a vector that contains a beginning and an ending timestamp, so this code is meant to pull data only from a chosen time period):
test = sqlQuery(channel, paste("SELECT CAST(table1.IDNUM AS CHAR),PartyA,PartyB FROM
table1, table2 WHERE table1.IDNUM=table2.IDNUM AND table1.Timestamp>=",Interval[1],"
AND table2.Timestamp<",Interval[2],sep=""))
You will most likely want to read the documentation for the function you are using at ?sqlQuery, which includes notes about the following two relevant arguments:
as.is: which (if any) columns returned as character should be converted to another type? Allowed values are as for read.table. See ‘Details’.
and
stringsAsFactors: logical: should columns returned as character and not excluded by as.is and not converted to anything else be converted to factors?
In all likelihood you want to specify the columns in question in as.is.
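The reason the IDs must stay as text is that a 20-digit integer does not fit exactly in a double-precision number, which is what R uses for numeric values. A short Python illustration of the precision loss follows; the ID value and the helper name `survives_as_double` are made up.

```python
def survives_as_double(id_text):
    """True if the integer in id_text is unchanged after a trip through a 64-bit float."""
    return str(int(float(id_text))) == id_text

id_text = "12345678901234567890"      # hypothetical 20-digit ID
print(survives_as_double(id_text))    # False: the float holds a nearby value instead
print(survives_as_double("2001"))     # True: small integers are exact
```

Two different 20-digit IDs can therefore collapse to the same numeric value, which would silently corrupt a join.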

MySQL database shows 'b' in place of bit type field

A MySQL table shows the value 'b' in place of bit-type data. Why?
Does anybody know how to convert it back into its original format?
I want values as 0 or 1 in these columns.
Taken from Bit-Field Literals
Beginning with MySQL 5.0.3, bit-field values can be written using
b'value' or 0bvalue notation. value is a binary value written using
zeros and ones.
Bit values are returned as binary values. To display them in printable
form, add 0 or use a conversion function such as BIN(). High-order 0
bits are not displayed in the converted value.
I found the solution.
Just use the value in an SQL query;
the query returns only 0 or 1 for the bit value, even though MySQL displays the value as 'b'.
No need to worry.
I tried "select flag * 4 from table where id = 1" and the answer was 0, since 0*4=0.
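On the client side the same issue can show up as raw bytes instead of a number. The sketch below assumes the database driver hands a BIT column back as a bytes object, which some drivers do and others do not; `bit_to_int` is a hypothetical helper.

```python
def bit_to_int(raw):
    """Convert a BIT column value to an int, whether it arrives as bytes or as a number."""
    if isinstance(raw, (bytes, bytearray)):
        # Interpret the bytes as an unsigned big-endian integer.
        return int.from_bytes(raw, byteorder="big")
    return int(raw)

print(bit_to_int(b"\x01"))
print(bit_to_int(b"\x00"))
```

Doing the conversion in the query itself (adding 0 or multiplying, as above) avoids the question of what the driver returns.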