How to bitwise OR into a binary(100)? - mysql

I have a binary(100), and I want to bitwise OR just one of its bytes with a constant.
Any idea how this would be done?
Alternatively, how can I store a value into a byte of a binary(100)?

Firstly, consider whether BINARY is actually the appropriate field type. Compared to BLOB, it has a potentially nasty "feature": values are padded to the declared length, and old versions (before MySQL 5.0.15) padded with spaces and stripped trailing spaces on retrieval. BINARY is really designed to be a fixed-length, case-sensitive byte string, not a blob of arbitrary binary data.
If you do use a blob, you'd need to use the SUBSTRING() function combined with ASCII() to extract just the byte you want, then apply the | bitwise operator.
To set something in the second byte you'd need to use something like:
-- tbl and col are placeholder names
UPDATE tbl SET col = CONCAT(
  SUBSTR(col, 1, 1),
  CHAR(ASCII(SUBSTR(col, 2, 1)) | 0x80),
  SUBSTR(col, 3)
)
A possibly simpler solution might be to treat your 100 bytes as 12.5 lots of 64 bits (i.e. BIGINT), and then use direct bitwise operations on individual words.
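For anyone who wants to check the byte arithmetic outside the database, here is a minimal Python sketch of the same idea as the CONCAT/SUBSTR update: split the value around one byte, OR that byte with a mask, and glue the pieces back together. The function name or_byte and the 1-based position argument are my own choices for illustration (the position is 1-based only to mirror SUBSTR), not anything MySQL provides.

```python
def or_byte(value: bytes, position: int, mask: int) -> bytes:
    """Return a copy of value with the byte at 1-based position OR-ed with mask."""
    if not 1 <= position <= len(value):
        raise IndexError("position out of range")
    if not 0 <= mask <= 0xFF:
        raise ValueError("mask must fit in one byte")
    i = position - 1  # convert to Python's 0-based indexing
    # prefix + modified byte + suffix, like CONCAT(SUBSTR, CHAR(...), SUBSTR)
    return value[:i] + bytes([value[i] | mask]) + value[i + 1:]


# A 100-byte value of zeros stands in for the binary(100) column.
col = bytes(100)
updated = or_byte(col, 2, 0x80)
```

The length never changes, which is the property you need for a fixed-width column.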

Related

Using large numbers in sql query

I have a field called size which is a BIGINT storing the number of bytes in a file. To get a file that is larger than 1GB I am currently doing:
size > (1024*1024*1024)
But this looks a bit hairy. Is there another way to write this that makes it more clear that the result of 1024*1024*1024 is 1GiB?
Additionally is the exponent operator built into mysql? I've used
select power(2, 30)
But I was wondering if there was a shortform to do that directly in the query, such as 2^30.
^ is the bitwise XOR operator in MySQL, so 2^30 evaluates to 28 rather than 2 to the 30th power.
Either POW(2,30) or POWER(2,30) (or POWER(1024,3)) will work; I believe of the two POWER is the more standard. There is no typographic operator for exponentiation.
I would just leave it as 1024*1024*1024; to me that provides the best readability (and makes it clear it is 1 GiB, not 1 GB).
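If the number lives in application code rather than in the query, a named constant gets you the readability without any operator tricks. This is only a sketch of the arithmetic in Python, where ^ happens to be XOR as well; the constant names are made up for the example.

```python
KIB = 1024
GIB = KIB ** 3            # 1024 * 1024 * 1024 bytes

power_result = 2 ** 30    # exponentiation, same value as GIB
xor_result = 2 ^ 30       # bitwise XOR, which is NOT exponentiation
```

Passing GIB into the query as a bound parameter keeps the SQL itself down to size > ?.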

MySQL: parse and cast strings which contain numbers with units

I have a table with a column holding string values that combine a number and a unit. Each value starts with a numeric prefix, which is either an integer or a decimal.
Some examples of these values are the following:
"16 GB", "8.5gb", "15.99345 GHz", "25L"
Is there a way I can use the cast function to first parse the string values that contain numbers and decimals and only do the cast on that portion of the values?
This is what I had in mind
select * from my_table
where cast( numparse( my_column ) as signed ) > 10
Thanks in advance, I'm fairly new to SQL so any help would be appreciated.
Yes, you could write a stored procedure that does some sort of string parsing, or use a regex as in @ladd2025's answer...
But then you'd be redoing this conversion on every query. There's the cost of the conversion, but it also means you cannot take advantage of indexing. A query like where parse_the_thing( thing ) > 10 has to do a full table scan, whereas if thing were an indexed number to begin with, where thing > 10 would be a very fast indexed query. This is a problem with storing "formatted" information: you have to strip the formatting every time you want to do something with it.
You'd be far better off normalizing your stored data to store the magnitude as a numeric data type such as bigint, double, or numeric, and the unit as an enum column. Or consider if it makes sense to store all these different units in the same table; does it make sense to compare 8.5 gb with 15.99 Ghz?
8.5gb stored in bytes would become the bigint 8,500,000,000 (the exact value depends on whether a GB is taken as 1000^3 or 1024^3 bytes) with the unit bytes. 15.99345 GHz might become the bigint 15,993,450,000 with the unit Hz. And so on.
You can accomplish this by adding the new columns to your table and running an update that converts the strings into a quantity and a unit. Then change whatever is inputting the values to do the same. You can continue to store the original human-formatted string if you like, but you might be better off dropping it and applying the formatting as needed.
This makes your queries much simpler, with less chance of bugs. And they can take advantage of indexing, so they'll be much, much faster.
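To make the normalization step concrete, here is a rough Python sketch of what the input side could do before storing a value. Everything in it is an assumption for illustration: the BASE_UNITS table, the decimal (1000-based) multipliers, and the function name normalize are hypothetical, and a real table of units would need to match your data.

```python
import re
from decimal import Decimal

# Number (integer or decimal), optional spaces, then a unit made of letters.
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")

# Hypothetical lookup: lower-cased unit -> (multiplier, stored base unit).
# Decimal prefixes are assumed here; use 1024**3 instead if "gb" means GiB.
BASE_UNITS = {
    "gb": (10**9, "bytes"),
    "ghz": (10**9, "Hz"),
    "l": (1, "L"),
}


def normalize(text: str) -> tuple[int, str]:
    """Split '8.5gb' style input into an integer magnitude and a base unit."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    magnitude = Decimal(match.group(1))  # Decimal avoids float rounding
    factor, unit = BASE_UNITS[match.group(2).lower()]
    return int(magnitude * factor), unit


rows = [normalize(s) for s in ["16 GB", "8.5gb", "15.99345 GHz", "25L"]]
```

The two returned values map onto the magnitude column and the unit column, so where magnitude > 10 can use an index.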
You could use REGEXP_REPLACE (MySQL 8.0 and later):
SELECT *
FROM tab
WHERE CAST(REGEXP_REPLACE(my_column, '[^0-9.]', '') AS signed) > 10;
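To see what a "remove everything except digits and dots" pattern does to the sample values, here is the same character class applied in Python. This is only an illustration of the pattern, not of MySQL's regex engine, and numeric_part is a made-up helper name.

```python
import re


def numeric_part(text: str) -> str:
    """Drop every character that is not a digit or a dot."""
    return re.sub(r"[^0-9.]", "", text)


samples = ["16 GB", "8.5gb", "15.99345 GHz", "25L"]
cleaned = [numeric_part(s) for s in samples]

# Limitation: digits anywhere in the string survive and get glued together.
tricky = numeric_part("4 x 2.5GB")
```

The last line is the case to watch for: the approach only works if the unit part never contains digits or dots.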
Just use the CAST() function. If you're casting to a numeric type, it will just parse the prefix and ignore the rest.
mysql> select cast('12.45gb' as signed);
+---------------------------+
| cast('12.45gb' as signed) |
+---------------------------+
|                        12 |
+---------------------------+

MySQL hex strings low bytes first

If I insert 0xFF into a binary column, MySQL (5.7) assumes these are the high bytes.
e.g. if the column is BINARY(2):
+--------------------+
| HEX(binary_column) |
+--------------------+
| FF00               |
+--------------------+
Just for convenience, how would you get MySQL to interpret a hex string normally?
P.S. Also tried UNHEX()
binary is not really a numeric datatype. It is a special type of string used to store binary data like files. In contrast to e.g. char, binary has no character set or collation, and comparisons are done on the numeric byte values.
That behaviour is similar to how other programming languages treat strings and byte arrays, and is expected in MySQL too; see The BINARY and VARBINARY Types:
When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 on insert, and no trailing bytes are removed on select. All bytes are significant in comparisons, including ORDER BY and DISTINCT operations. 0x00 bytes and spaces are different in comparisons, with 0x00 < space.
You seem to be looking for binary numbers, so you may want to use a numeric type. You can use e.g. int (or bit(16)) and still insert values like 0xFF (just not as '0xFF' without further casting), and you can still display them with e.g. hex(0xFF) in the way you want.
If you want to use binary values (or need values larger than 8 bytes), you can use lpad to fill them with leading zero bytes, e.g.
select hex(lpad(0xFF,2,0x00))
You have to know (or query) the size of your column, and you will probably run into a lot of issues with this, starting with the simple task of adding two binary values. So to keep it simple, use a numeric type.
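The difference between the two padding directions is easy to see on a plain byte string. This Python sketch just mirrors the idea (right-padding is what BINARY does on insert, left-padding is what lpad achieves); the variable names are only for the example, and width stands in for the column size you would have to know.

```python
value = bytes([0xFF])
width = 2  # assumed column size, as in BINARY(2)

right_padded = value.ljust(width, b"\x00")  # pad on the right, like BINARY storage
left_padded = value.rjust(width, b"\x00")   # pad on the left, like lpad(..., 0x00)

# Read back as a big-endian number, the left-padded form keeps its value.
as_number = int.from_bytes(left_padded, "big")
```

Here right_padded.hex() gives ff00 and left_padded.hex() gives 00ff, matching the two HEX() results discussed above.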

Are there any illegal characters in MySQL which may not be stored in a field?

I'm looking for a shorthand solution to storing an md5 hash inside of a MySQL table, as string data. I had the idea that base256 could reduce the length of the string by half, down to a 16-character string instead of 32 digits of hex. So I take the hex and divide it up into chunks of two digits programmatically, then convert each pair of digits to ASCII. For example:
4cf5f5941a02573dc007e60442f5358a
is shortened to
Lõõ”W=ÀæBõ5Š
and it's OK if these characters don't print properly - I just need to store them. Would MySQL accept that sort of ASCII data into a text field without complaining?
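MySQL will accept these values, ideally in a BINARY(16) or VARBINARY column, since a column with a character set such as utf8 can reject or mangle byte sequences that are not valid in that encoding. You must also be very careful when writing them - I strongly suggest binding parameters.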
You might want to look into COMPRESS() and UNCOMPRESS() as an alternative:
INSERT INTO ... SET hashcode=COMPRESS('4cf5f5941a02573dc007e60442f5358a');
and
SELECT UNCOMPRESS(hashcode) AS hashcode FROM ... WHERE
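might do the trick more readably. Note, though, that COMPRESS() prepends a 4-byte length and adds zlib overhead, so for a 32-character string the result is unlikely to be any shorter; UNHEX() and HEX() convert directly between the hex string and its 16 raw bytes.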
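As a rough sketch of the "bind parameters" advice: convert the hex digest to its 16 raw bytes in the application and hand them to the driver as a parameter, so no escaping is ever needed. The example below uses Python's built-in sqlite3 purely as a stand-in so that it runs anywhere; that is an assumption of the sketch, and with a MySQL driver the placeholder is typically %s instead of ? and the column would be BINARY(16). The table name hashes is made up.

```python
import sqlite3

hex_digest = "4cf5f5941a02573dc007e60442f5358a"
raw = bytes.fromhex(hex_digest)  # 16 raw bytes instead of 32 hex characters

# sqlite3 is only a stand-in here; the point is the bound parameter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (hashcode BLOB)")
conn.execute("INSERT INTO hashes (hashcode) VALUES (?)", (raw,))

(stored,) = conn.execute("SELECT hashcode FROM hashes").fetchone()
round_trip = stored.hex()  # back to the original hex string
conn.close()
```

Because the bytes travel as a parameter, it does not matter that some of them are unprintable or look like quote characters.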

using binary text comparison in mysql - efficiency pitfalls?

I'm using binary string comparison on user names (defined as varchar) to be sure the string matches exactly:
... where binary Owner = '$user' ...
or
... from Records join Users on binary Records.Owner = Users.User ....
But I'm not sure if it has some (either positive or negative) impact on efficiency. The manual states:
Note that in some contexts, if you cast an indexed column to BINARY,
MySQL is not able to use the index efficiently.
Is it an issue in this case? I would expect binary to be faster on the contrary, because it doesn't have to ignore case, whitespace, some accents etc.
Indeed, it's possible for an expression in a WHERE clause to make it impossible to use an index on a column. You wouldn't expect an index on a float column called x to be any use for searching
WHERE SIN(x) BETWEEN 0.0 and 0.1
for example.
The answer to your problem is to define your Owner column to have a binary collation rather than a case-insensitive collation. See here for how to do that: http://dev.mysql.com/doc/refman/5.0/en/charset-column.html