Convert string to int in MySQL - mysql

I have a column with names like:
Ernest Hemingway
Jackson Pollock
I want to convert them to numbers and store them in an INT field. Maybe by getting the position of each letter in the alphabet or something like that, resulting in a number:
23764283456
23984623746
Is there any function to do something like this? I don't mind the length of the INT or if the result is one number or another. The important thing is that every time I apply the function to a name, the result is the same.
Thanks!

Try this:
SELECT CRC32('Ernest Hemingway');
will always give you 2479642411
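The same idea can be sketched outside the database. This is a minimal Python illustration, assuming MySQL's CRC32() computes the same standard CRC-32 as zlib; the helper name name_to_int is made up for this example. Note that the result can be as large as 2^32 - 1, so it needs an INT UNSIGNED or BIGINT column rather than a signed INT.

```python
import zlib

def name_to_int(name: str) -> int:
    """Map a name to a 32-bit unsigned integer via CRC-32 (hypothetical helper)."""
    return zlib.crc32(name.encode("utf-8"))

first = name_to_int("Ernest Hemingway")
second = name_to_int("Ernest Hemingway")
print(first == second)      # the same input always gives the same number
print(0 <= first < 2**32)   # the result fits in an unsigned 32-bit integer
```

Because CRC-32 is a checksum and not a unique key, two different names can map to the same number.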

As @Gordon_Linoff said in the comments, a large number can't be stored in a field of type INT,
but I will show you how to convert a string to the ASCII codes of its characters.
You can use HEX:
SELECT HEX('test')
+-------------+
| HEX('test') |
+-------------+
| 74657374    |
+-------------+
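To see why the hex form runs into the size problem mentioned above, here is a small Python sketch of the same conversion. The function names are invented for this illustration, and it assumes plain ASCII input.

```python
def string_to_hex(text: str) -> str:
    """Hex-encode the ASCII bytes of a string, like HEX('test') (hypothetical helper)."""
    return text.encode("ascii").hex().upper()

def string_to_int(text: str) -> int:
    """Read the hex encoding as one big integer."""
    return int(string_to_hex(text), 16)

print(string_to_hex("test"))  # 74657374
print(string_to_int("test"))  # 1952805748
```

Every extra character adds two hex digits, so anything longer than a few characters no longer fits in a fixed-size integer column.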

This is a one-way hash, but with an important concern: the integer should be representable on the platform.
PHP code, assuming 32-bit compatibility is desired:
$hash = sha1('Ernest Hemingway');
// last 6 characters, represent 3 bytes
$hash = substr($hash, -6);
$result = hexdec($hash); // integer: 1331016
Keep in mind this has very low entropy: 2^24 = 16777216 possibilities.
4 bytes is too large, because signed/unsigned integer discrepancies would lead to a float for some inputs, and floats can't be cast to integers with perfect determinism.
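For comparison, the same truncated-hash idea looks roughly like this in Python. This is only a sketch of the technique; short_hash is a name chosen for the example, and it keeps the last 3 bytes of the SHA-1 digest just as the PHP code does.

```python
import hashlib

def short_hash(name: str) -> int:
    """Return the last 3 bytes of the SHA-1 digest as an integer (hypothetical helper)."""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    # last 6 hex characters represent 3 bytes
    return int(digest[-6:], 16)

value = short_hash("Ernest Hemingway")
print(value == short_hash("Ernest Hemingway"))  # deterministic
print(0 <= value < 2**24)                       # always below 16777216
```

With only 2^24 possible values, collisions between different names become likely once the table holds a few thousand rows.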

SELECT field,CONVERT(SUBSTRING_INDEX(field,'-',-1),UNSIGNED INTEGER) AS num
FROM table
ORDER BY num;

Related

MySQL Get the integer value between two strings

I am trying to get the integer value between two specific strings, but I am a little bit stuck.
Example full string:
"The real ABC4_ string is probably company secret."
I need to get the "4" between "ABC" and "_". First I came up with the following query:
select substring_index(substring_index('The real ABC4_ string is probably company secret.', 'ABC', -1),'_', 1);
It gives me 4, perfect! But the problem is if ABC occurs more than one time in the string it fails. I can't simply increase the counter also since I don't know how many times it will be in the future. I have to get first occurrence of that regex: ABC[DIGIT]_
I've seen REGEXP_SUBSTR function but since we use older version of MySQL than 8.0 I can't use it also.
Any suggestions?
Thanks!
Without using Regex, here is an approach using LOCATE(), and other string functions:
SET @input_string = 'The real ABC4_ string is probably company secret.';
SELECT TRIM(LEADING 'ABC'
FROM SUBSTRING_INDEX(
SUBSTR(@input_string FROM
LOCATE('ABC', @input_string)
)
,'_', 1
)
) AS number_extracted;
| number_extracted |
| ---------------- |
| 4 |
Another way of (ab)using the LOCATE() function:
select substr('The real ABC4_ string is probably company secret.',
locate('ABC', 'The real ABC4_ string is probably company secret.') + 3,
locate('_','The real ABC4_ string is probably company secret.') -
locate('ABC', 'The real ABC4_ string is probably company secret.') - 3) AS num;
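The LOCATE() logic used in these answers can be checked against a small Python sketch. It is not MySQL code; extract_between is a hypothetical helper that mirrors the steps: find the first 'ABC', then take everything up to the next '_' after it.

```python
def extract_between(text: str, prefix: str = "ABC", suffix: str = "_"):
    """Return the text between the first `prefix` and the next `suffix`, or None."""
    start = text.find(prefix)
    if start == -1:
        return None              # prefix not present
    start += len(prefix)         # skip past the prefix itself
    end = text.find(suffix, start)
    if end == -1:
        return None              # no suffix after the prefix
    return text[start:end]

print(extract_between("The real ABC4_ string is probably company secret."))  # 4
print(extract_between("first ABC7_ and later ABC9_"))                        # 7
```

Only the first occurrence is used, so later 'ABC' markers in the string do not change the result.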

How to add trailing space after some number of character (varchar) using sql stored procedure?

I need to display output data in which each column is padded with white space to a fixed length. Each column may have a different length.
select top (20)
patname + REPLICATE(' ', 30 - DATALENGTH(patname)) AS NAME,
RefNo + REPLICATE(' ', 15 - DATALENGTH(RefNo)) AS REFNO,
ClaimAmt AS AMOUNT
from AR_Ebilling
Field lengths need to be set:
NAME = 30 varchar, REFNO = 15 varchar, AMOUNT = 10 money/decimal,
Expected Output Result :
NAME                          REFNO          AMOUNT
Ahmad Kasan                   1235           00000000565.93
Amirah                        AY582M8D       -00023441200.23
Paul                          0ST127         00000004234.45
TQ
The answer is: don't! One does not store redundant data in databases. All those white spaces do not need to be stored, because it's a trivial matter to pad out a string in your favourite programming language or even within MySQL itself. To this end MySQL has:
RPAD
Returns the string str, right-padded with the string padstr to a
length of len characters. If str is longer than len, the return value
is shortened to len characters.
And
LPAD
Returns the string str, left-padded with the string padstr to a length
of len characters. If str is longer than len, the return value is
shortened to len characters.
Also note that needlessly storing padded strings in the table makes it futile to use VARCHAR.
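As an illustration of padding in application code instead of in the table, here is a minimal Python sketch. The widths 30 and 15 come from the question; fixed_width and format_row are hypothetical helper names.

```python
def fixed_width(value: str, width: int) -> str:
    """Right-pad with spaces to `width`, cutting off anything longer."""
    return value.ljust(width)[:width]

def format_row(name: str, ref_no: str) -> str:
    """Build one output line: NAME in 30 characters, REFNO in 15."""
    return fixed_width(name, 30) + fixed_width(ref_no, 15)

row = format_row("Ahmad Kasan", "1235")
print(repr(row))
print(len(row))  # 45
```

The stored values stay unpadded, and the fixed-width layout is produced only when the report is written.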

Best datatype to store a long number made of 0 and 1

I want to know what's the best datatype to store these:
null
0
/* the length of other numbers is always 7 digits */
0000000
0000001
0000010
0000011
/* and so on */
1111111
I have tested that INT works, but since all my numbers are made of only 0 and 1 digits, is there a better datatype?
What you are showing are binary numbers
0000000 = 0
0000001 = 2^0 = 1
0000010 = 2^1 = 2
0000011 = 2^0 + 2^1 = 3
So simply store these numbers in an integer data type (which is internally stored with bits as shown of course). You could use BIGINT for this, as recommended in the docs for bitwise operations (http://dev.mysql.com/doc/refman/5.7/en/bit-functions.html).
Here is how to set flag n:
UPDATE mytable
SET bitmask = POW(2, n-1)
WHERE id = 12345;
Here is how to add a flag:
UPDATE mytable
SET bitmask = bitmask | POW(2, n-1)
WHERE id = 12345;
Here is how to check a flag:
SELECT *
FROM mytable
WHERE bitmask & POW(2, n-1)
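The three operations above can be tried out in a few lines of Python. This is just a sketch of the same bit arithmetic, with 1 << (n - 1) standing in for POW(2, n-1); the function names are invented for the example.

```python
def set_only_flag(n: int) -> int:
    """Return a bitmask with only flag n set (flags are numbered from 1)."""
    return 1 << (n - 1)

def add_flag(bitmask: int, n: int) -> int:
    """Return the bitmask with flag n switched on, keeping the other flags."""
    return bitmask | (1 << (n - 1))

def has_flag(bitmask: int, n: int) -> bool:
    """Check whether flag n is set."""
    return bool(bitmask & (1 << (n - 1)))

mask = set_only_flag(1)
mask = add_flag(mask, 2)
print(format(mask, "07b"))                   # 0000011
print(has_flag(mask, 2), has_flag(mask, 3))  # True False
```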
But as mentioned in the comments: In a relational database you usually use columns and tables to show attributes and relations rather than an encoded flag list.
As you've said in a comment, the values 01 and 1 should not be treated as equivalent (which rules out binary where they would be), so you could just store as a string.
It actually might be more efficient than storing as a byte + offset since that would take up 9 characters, whereas you need a maximum of 7 characters
Simply store as a varchar(7) or whatever the equivalent is in MySQL. No need to be clever about it, especially since you are interested in extracting positional values.
Don't forget to bear in mind that this takes up a lot more storage than storing as a bit(7), since you are essentially storing 7 bytes (or whatever the storage unit is for each level of precision in a varchar), not 7 bits.
If that's not an issue then no need to over-engineer it.
You could convert the binary number to a string, with an additional byte to specify the number of leading zeros.
Example - the representation of 010:
The numeric value in hex is 0x02.
There is one leading zero, so the first byte is 0x01.
The result string is 0x01,0x02.
With the same method, 1010010 should be represented as 0x00,0x52.
Seems to me pretty efficient.
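A rough Python sketch of this two-byte scheme follows. The encode and decode names are made up here, and the code assumes inputs of at most 8 binary digits so that the numeric value fits in one byte.

```python
def encode(bits: str) -> bytes:
    """Pack a 0/1 string into two bytes: leading-zero count, then the numeric value."""
    stripped = bits.lstrip("0")
    leading_zeros = len(bits) - len(stripped)
    value = int(stripped, 2) if stripped else 0
    return bytes([leading_zeros, value])

def decode(data: bytes) -> str:
    """Rebuild the original 0/1 string from the two bytes."""
    leading_zeros, value = data[0], data[1]
    digits = format(value, "b") if value else ""
    return "0" * leading_zeros + digits

print(encode("010").hex())      # 0102
print(encode("1010010").hex())  # 0052
print(decode(encode("0000000")))  # 0000000
```

Strings such as 01 and 1 get different encodings, so they are not treated as equivalent.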
Not sure if it is the best datatype, but you may want to try BIT:
MySQL, PostgreSQL
There are also some useful bit functions in MySQL.

I need a trigger to create id's in my sql database with a string and some zeros

I'm currently using this trigger which adds id's with 3 zeros and two zeros and then the id from the sequences table.
BEGIN
INSERT INTO sequences VALUES (NULL);
SET NEW.deelnemernr = CONCAT('ztmr16', LPAD(LAST_INSERT_ID(), 3, '0'));
END
I changed the 3 to 4, but then it didn't increment the id anymore, resulting in a multiple-id error. It stayed at ztmr16000. So what can I do to add more zeros and still get the id from the sequences table?
The MySQL LPAD function limits the number of characters returned to the specified length.
It is a bit unclear from the specification what you are trying to achieve.
If I need a fixed length string with leading zeros, my approach would be to prepend a boatload of zeros to my value, and then take the rightmost string, effectively lopping off extra zeros from the front.
To format a non-negative integer value val into a string that is ten characters in length, with the leading characters as zeros, I'd do something like this:
RIGHT(CONCAT('000000000',val),10)
As a demonstration:
SELECT RIGHT(CONCAT('000000000','123456789'),10) --> 0123456789
SELECT RIGHT(CONCAT('000000000','12345'),10) --> 0000012345
Also, I'd be cognizant of the maximum length allowed in the column I was populating, and be sure that the length of the value I was generating didn't exceed that, to avoid data truncation.
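The prepend-and-take-rightmost approach can be mimicked in Python to see how it behaves. This is a sketch of the same string logic, not MySQL code, and zero_pad_right is a hypothetical name.

```python
def zero_pad_right(val, width: int = 10) -> str:
    """Prepend zeros, then keep only the rightmost `width` characters."""
    padded = "0" * (width - 1) + str(val)
    return padded[-width:]

print(zero_pad_right("123456789"))  # 0123456789
print(zero_pad_right(12345))        # 0000012345
print(zero_pad_right(12345678901))  # 2345678901
```

The last call shows that a value longer than the target width loses its leading digits, which is why the column length still needs checking.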
If the value being returned isn't being truncated when it's inserted into the column, then I think the behavior you observe is due to the value returned from LAST_INSERT_ID() exceeding 999.
Note that for a non-negative integer value val, the expression
LPAD(val,3,'0')
will allow at most 1000 distinct values. LPAD (as I noted earlier) restricts the length of the returned string. In this example, to three characters. As a demonstration of the behavior:
SELECT LPAD( 21,3,'0') --> 021
SELECT LPAD( 321,3,'0') --> 321
SELECT LPAD( 54321,3,'0') --> 543
SELECT LPAD( 54387,3,'0') --> 543
There's nothing illegal with doing that. But you're going to be in trouble if you depend on that to generate "unique" values.
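To make the uniqueness problem concrete, here is a small Python model of the LPAD behavior demonstrated above. It is only a sketch that assumes a single-character pad string; lpad here is a stand-in written for this example, not the MySQL function itself.

```python
def lpad(val, length: int, pad: str = "0") -> str:
    """Left-pad to `length`; values that are too long keep only their first `length` characters."""
    text = str(val)
    if len(text) >= length:
        return text[:length]
    return pad * (length - len(text)) + text

print(lpad(21, 3))     # 021
print(lpad(54321, 3))  # 543
print(lpad(54387, 3))  # 543

# Count how many distinct three-character results 100000 ids produce.
distinct = {lpad(i, 3) for i in range(100000)}
print(len(distinct))   # 1000
```

Two different ids ending up as 543 is exactly the kind of collision that breaks a column expected to hold unique values.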
FOLLOWUP
As stated, the specification ...
"adds id's with 3 zeros and two zeros and then the id from the sequences table."
is very unclear. What is it exactly that you want to achieve? Consider providing some examples. It doesn't seem like there's an issue concatenating something to those first five fixed characters. The issue seems to be with getting the id value "formatted" to your specification
This is just a guess of what you are trying to achieve:
id value  formatted return
--------  ----------------
1         0001
9         0009
22        0022
99        0099
333       0333
4444      4444
55555     55555
666666    666666
You could achieve that with something like this:
BEGIN
DECLARE v_id BIGINT;
INSERT INTO sequences VALUES (NULL);
SELECT LAST_INSERT_ID() INTO v_id;
IF ( v_id <= 9999 ) THEN
SET NEW.deelnemernr = CONCAT('ztmr16',LPAD(v_id,4,'0'));
ELSE
SET NEW.deelnemernr = CONCAT('ztmr16',v_id);
END IF;
END

MySQL string cast to unsigned

If I have a string that starts with a number, then contains non-numeric characters, casting this string to an integer in MySQL will cast the first part of the string, and give no indication that it ran into any problems! This is rather annoying.
For example:
SELECT CAST('123' AS UNSIGNED) AS WORKS,
CAST('123J45' AS UNSIGNED) AS SHOULDNT_WORK,
CAST('J123' AS UNSIGNED) AS DOESNT_WORK
returns:
+-------------+---------------+-------------+
| WORKS       | SHOULDNT_WORK | DOESNT_WORK |
+-------------+---------------+-------------+
|         123 |           123 |           0 |
+-------------+---------------+-------------+
This doesn't make any sense to me, as clearly, 123J45 is not a number, and certainly does not equal 123. Here's my use case:
I have a field that contains (some malformed) zip codes. There may be mistypes, missing data, etc., and that's okay from my perspective. Because of another table storing Zip Codes as integers, when I join the tables, I need to cast the string Zip Codes to integers (I would have to pad with 0s if I was going the other way). However, if for some reason there's an entry that contains 6023JZ1, in no way would I want that to be interpreted as Zip Code 06023. I am much happier with 6023JZ1 getting mapped to NULL. Unfortunately, IF(CAST(zipcode AS UNSIGNED) <= 0, NULL, CAST(zipcode AS UNSIGNED)) doesn't work because of the problem discussed above.
How do I control for this?
Use a regular expression:
select (case when val rlike '^[0-9][0-9][0-9][0-9][0-9]$' then cast(val as unsigned)
end)
Many people consider it a nice feature that MySQL does not automatically produce an error when doing this conversion.
One option is to test for just digit characters 0 through 9 for the entire length of the string:
zipstr REGEXP '^[0-9]+$'
Based on the result of that boolean, you could return the integer value, or a NULL.
SELECT IF(zipstr REGEXP '^[0-9]+$',zipstr+0,NULL) AS zipnum ...
(note: the addition of zero is an implicit conversion to numeric)
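The same all-digits rule can be sketched in Python to check the intended results for the examples in the question. strict_int is a hypothetical helper, and None plays the role of SQL NULL.

```python
import re

def strict_int(text):
    """Return the integer value only if the whole string consists of digits 0-9."""
    if text is not None and re.fullmatch(r"[0-9]+", text):
        return int(text)
    return None

print(strict_int("123"))      # 123
print(strict_int("123J45"))   # None
print(strict_int("J123"))     # None
print(strict_int("6023JZ1"))  # None
```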
Another option is to do the conversion like you are doing, and cast the numeric value back to character, and compare to the original string, to return a boolean:
CAST( zipstr+0 AS CHAR) = zipstr
(note: this second approach does allow for a decimal point, e.g.
CAST( '123.4'+0 AS CHAR ) = '123.4' => 1
which may not be desirable if you are looking for just a valid integer)