MySQL bitwise AND 256-bit binary values - mysql

I'm intending on storing a 256-bit long binary value in a MySQL table column.
Which column type should I be using (blob?) such that I can run bitwise operations against it (example of an AND would be ideal).

I don't think you can perform bitwise operations on 256-bit values at the SQL level, as the documentation clearly states that:
MySQL uses BIGINT (64-bit) arithmetic for bit operations, so these operators have a maximum range of 64 bits.
http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html#operator_bitwise-and
As for storing those values, TINYBLOB is possible, but my personal preference would be simply BINARY(32) (a binary string of 32 bytes, i.e. 256 bits).
While writing this, one trick came to mind. If we are limited to 64-bit values (BIGINT UNSIGNED), why not store your 256-bit value as 4 words of 64 bits each? Not very elegant, but it would work, especially here since you only need bitwise operations:
ABCD (32 bytes) & WXYZ (32 bytes) == A & W, B & X, C & Y, D & Z (8 bytes each)
Very basically:
create table t (a bigint unsigned,
b bigint unsigned,
c bigint unsigned,
d bigint unsigned);
When inserting, each 256-bit value has to be split into 4 words:
-- Hexadecimal notation is used here for conciseness; you may use b'010....000' if you prefer
insert into t values (0xFFFFFFFF,
0xFFFF0000,
0xFF00FF00,
0xF0F0F0F0);
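The split itself has to happen somewhere before the INSERT, typically in application code. A minimal sketch in Python, assuming the 256-bit value is held as a plain Python integer; split_256 and join_256 are hypothetical helper names, not part of MySQL or any driver:

```python
MASK64 = (1 << 64) - 1  # all 64 bits set


def split_256(value):
    """Split a 256-bit integer into four 64-bit words (a, b, c, d), a being the most significant."""
    if not 0 <= value < (1 << 256):
        raise ValueError("value must fit in 256 bits")
    return tuple((value >> shift) & MASK64 for shift in (192, 128, 64, 0))


def join_256(words):
    """Rebuild the 256-bit integer from its four 64-bit words."""
    result = 0
    for word in words:
        result = (result << 64) | (word & MASK64)
    return result


# Example: a value with only its highest and lowest bits set.
a, b, c, d = split_256((1 << 255) | 1)
print(hex(a), hex(b), hex(c), hex(d))
```

The four words can then be bound to the columns a, b, c and d of the table above.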
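You can easily query the value back. Note that the sample words above are only 32 bits wide, so each one is padded to 8 hex digits here; full 64-bit words would need LPAD(HEX(x),16,'0'):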
mysql> select CONCAT(LPAD(HEX(a),8,'0'),
LPAD(HEX(b),8,'0'),
LPAD(HEX(c),8,'0'),
LPAD(HEX(d),8,'0')) from t;
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT(LPAD(HEX(a),8,'0'),
LPAD(HEX(b),8,'0'),
LPAD(HEX(c),8,'0'),
LPAD(HEX(d),8,'0')) |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| FFFFFFFFFFFF0000FF00FF00F0F0F0F0 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
I used hexadecimal here again, but you could display the value in binary by replacing HEX() with BIN() (and widening the padding accordingly).
And last but not least, you can perform bitwise operations on them. Once again, you just have to split the operand. Assuming I want to apply the mask 0xFFFFFFFFFFFFFFFF0000000000000000 (four 32-bit words, matching the sample data) to all values in the table:
update t set a = a & 0xFFFFFFFF,
b = b & 0xFFFFFFFF,
c = c & 0x00000000,
d = d & 0x00000000;
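The reason this works is that AND acts on each bit independently, so ANDing word by word gives the same result as ANDing the full values. A small Python sketch to check that, assuming full 64-bit words; to_words and and_words are hypothetical names used only for illustration:

```python
MASK64 = (1 << 64) - 1  # all 64 bits set


def to_words(value):
    """Split a 256-bit integer into four 64-bit words, most significant first."""
    return [(value >> shift) & MASK64 for shift in (192, 128, 64, 0)]


def and_words(left, right):
    """AND two word lists position by position, as the UPDATE does per column."""
    return [a & b for a, b in zip(left, right)]


value = int("F0" * 32, 16)        # arbitrary 256-bit sample value
mask = ((1 << 128) - 1) << 128    # keep the upper 128 bits, clear the lower 128

# Word-wise AND matches the AND of the whole 256-bit values.
assert and_words(to_words(value), to_words(mask)) == to_words(value & mask)
```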

Looks like blob works with a query like this for the bitwise and:
select id,bin(label & b'01000000010000001000000000000000000') from projects;

Related

In SQL - how can I count the number of times Bit(0), Bit(1), ... Bit(N) are high for a decimal number?

I am dealing with a table of decimal values that represent binary numbers. My goal is to count the number of times Bit(0), Bit(1),... Bit(n) are high.
For example, if a table entry is 5 this converts to '101' which can be done using the BIN() function.
For this entry, I would like to increment the variables 'bit0Count' and 'bit2Count'.
I have looked into the BIT_COUNT() function however this would only return 2 for the above example.
Any insight would be greatly appreciated.
SELECT SUM(n & (1<<2) > 0) AS bit2Count FROM ...
The & operator is a bitwise AND.
1<<2 is a number with only 1 bit set, left-shifted by two places, so it is binary 100. A bitwise AND of that against your column n yields either binary 100 or binary 000.
Testing that with > 0 returns either 1 or 0, since in MySQL, boolean results are literally the integers 1 for true and 0 for false (note this is not standard in other implementations of SQL).
Then you can SUM() these 1's and 0's to get a count of the occurrences where the bit was set.
To tell if bit N is set, use 1 << N to create a mask for that bit and then use bitwise AND to test it. So (column & (1 << N)) != 0 will be 1 if bit N is set, 0 if it's not set.
To total these across rows, use the SUM() aggregation function.
If you need to do this frequently, you could define a stored function:
-- Parameters are declared as name then type, and a RETURNS clause is required
CREATE FUNCTION bit_set(val INT UNSIGNED, which TINYINT) RETURNS BOOLEAN DETERMINISTIC
RETURN (val & (1 << which)) != 0;
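To see what the per-bit SUM() computes, here is a small sketch of the same logic outside the database, in Python. The function name bit_counts and the sample values are made up for illustration:

```python
def bit_counts(values, width):
    """Return a list where entry n is the number of values that have bit n set."""
    counts = []
    for n in range(width):
        mask = 1 << n  # only bit n set, like (1 << N) in the SQL
        counts.append(sum(1 for v in values if v & mask != 0))
    return counts


# 5 is binary 101, 1 is binary 001, 4 is binary 100
counts = bit_counts([5, 1, 4], 3)
print(counts)  # [bit0Count, bit1Count, bit2Count]
```

In SQL, each entry of that list corresponds to one SUM((n & (1 << N)) != 0) expression in the SELECT list.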

MySQL column without a one-size-fits-all precision for DECIMAL

When I define a table to store decimal values I use a statement like this:
CREATE TABLE myTable (
myKey INT NOT NULL,
myValue DECIMAL(10,2) NOT NULL,
PRIMARY KEY (myKey)
);
However, this results in every myValue being stored with a one-size-fits-all precision of (10,2). For instance
45.6 becomes 45.60
21 becomes 21.00
17.008 becomes 17.01
But what if each record has a myValue of different precision? I need 45.6 to remain 45.6, 21 to remain 21, and 17.008 to remain 17.008. Otherwise the precision of measurement is being lost. There's a big difference between 21 and 21.00.
If you don't need to do greater/less-than comparisons, store the value as a VARCHAR(..).
The strings '21' and '21.00' would have identical numeric values, but present different "precision".
When needing the numeric value, add zero (col + 0).
This does not allow for "negative precision", such as "1.2M" being represented as 1200000. If you need that, then Norbert's approach is probably better.
You can store with high precision and exact recall by following a different way of storing the data:
Create a table with two columns:
CREATE TABLE precise (value BIGINT, decimaldot INT);
Use code to determine where the dot is. For example, in your value 21 it comes after digit 2 (assuming 1-based indexing), so the stored row would be:
INSERT INTO precise values (21,2);
On retrieval this gives back exactly 21 (re-inserting the dot after position 2 of the digits 21 leaves 21).
Value 17.008 would also have decimaldot at 2:
INSERT INTO precise values (17008,2);
And so on.
Larger values can be stored by using a VARCHAR(4000) instead of a BIGINT, or by using BLOB fields.
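The "use code to determine where the dot is" step could look roughly like the Python sketch below. This is only one reading of the scheme, under the assumption that values arrive as unsigned decimal strings; encode_decimal and decode_decimal are hypothetical names. Note that values such as 0.5 do not fit the scheme as described, because the leading zero is lost once the digits are stored in a BIGINT, so the sketch rejects them:

```python
def encode_decimal(text):
    """Turn a decimal string such as '17.008' into (digits, dot_position)."""
    whole, _, fraction = text.partition(".")
    digits = whole + fraction
    if not digits.isdigit():
        raise ValueError("expected an unsigned decimal string")
    if len(digits) > 1 and digits.startswith("0"):
        raise ValueError("leading zeros would be lost in a BIGINT column")
    # The dot sits after len(whole) digits (1-based position of the last whole digit).
    return int(digits), len(whole)


def decode_decimal(digits, dot_position):
    """Rebuild the original string from the stored digits and dot position."""
    text = str(digits)
    if dot_position >= len(text):
        return text  # no fractional digits were stored
    return text[:dot_position] + "." + text[dot_position:]


for sample in ("21", "21.00", "45.6", "17.008"):
    print(sample, encode_decimal(sample))
```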

Better way to get number of bits different between two 128-bit MySQL binary values?

I'm using a MySQL binary column (tinyblob) to store a 128-bit perceptual image hash for about 200,000 images, and then doing a SELECT query to find images whose hash value is within a certain number of bits different (the hamming distance is less than a given delta).
To count the number of bits different, you can XOR the two values and then count the number of 1 bits in the result. MySQL has a handy function called BIT_COUNT that counts the number of 1 bits in an unsigned 64-bit integer.
So I'm currently using the following query to split the 128-bit hash into two 64-bit parts, doing the two XOR and BIT_COUNT operations, and adding the results to get the total bit delta:
SELECT asset_id, dhash8
FROM assets
WHERE
BIT_COUNT(CAST(CONV(HEX(SUBSTRING(dhash8, 1, 8)), 16, 10)
AS UNSIGNED) ^ :dhash8_0) + -- high part
BIT_COUNT(CAST(CONV(HEX(SUBSTRING(dhash8, 9, 8)), 16, 10)
AS UNSIGNED) ^ :dhash8_1) -- plus low part
<= :delta -- less than threshold?
But doing a substring, and especially converting it to a hex string and back is kind of annoying (and inefficient). Is there a better way to do this using MySQL?
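For reference, the computation the query performs, and the way the two bound parameters :dhash8_0 and :dhash8_1 would be derived on the application side, can be sketched in Python as follows. It assumes the hash is available as 16 raw bytes with the first byte most significant (which is how HEX() and CONV() read it in the query); split_hash and hamming_distance are hypothetical helper names:

```python
def split_hash(hash_bytes):
    """Split a 16-byte hash into (high, low) unsigned 64-bit integers."""
    if len(hash_bytes) != 16:
        raise ValueError("expected a 128-bit (16-byte) hash")
    high = int.from_bytes(hash_bytes[:8], "big")  # bytes 1..8, like SUBSTRING(dhash8, 1, 8)
    low = int.from_bytes(hash_bytes[8:], "big")   # bytes 9..16, like SUBSTRING(dhash8, 9, 8)
    return high, low


def hamming_distance(hash_a, hash_b):
    """Count differing bits: XOR each 64-bit half, then count the 1 bits."""
    a_high, a_low = split_hash(hash_a)
    b_high, b_low = split_hash(hash_b)
    return bin(a_high ^ b_high).count("1") + bin(a_low ^ b_low).count("1")
```

This only mirrors the existing query for checking results; it is not a faster alternative inside MySQL.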

Best datatype to store a long number made of 0 and 1

I want to know what's the best datatype to store these:
null
0
/* the length of other numbers is always 7 digits */
0000000
0000001
0000010
0000011
/* and so on */
1111111
I have tested INT and it works, but I suspect there is a better datatype, because all my numbers consist only of the digits 0 and 1. Is there one?
What you are showing are binary numbers:
0000000 = 0
0000001 = 2^0 = 1
0000010 = 2^1 = 2
0000011 = 2^0 + 2^1 = 3
So simply store these numbers in an integer data type (which is internally stored with bits as shown of course). You could use BIGINT for this, as recommended in the docs for bitwise operations (http://dev.mysql.com/doc/refman/5.7/en/bit-functions.html).
Here is how to set flag n:
UPDATE mytable
SET bitmask = POW(2, n-1)
WHERE id = 12345;
Here is how to add a flag:
UPDATE mytable
SET bitmask = bitmask | POW(2, n-1)
WHERE id = 12345;
Here is how to check a flag:
SELECT *
FROM mytable
WHERE bitmask & POW(2, n-1)
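The same three operations, sketched in Python to show what the statements above do to the stored integer. Flags are numbered from 1, as in POW(2, n-1); the function names are hypothetical:

```python
def flag_mask(n):
    """Mask with only flag n set (n counts from 1), i.e. 2 to the power n-1."""
    if n < 1:
        raise ValueError("flags are numbered from 1")
    return 1 << (n - 1)


def add_flag(bitmask, n):
    """Return bitmask with flag n switched on, leaving other flags untouched."""
    return bitmask | flag_mask(n)


def has_flag(bitmask, n):
    """True if flag n is set in bitmask."""
    return (bitmask & flag_mask(n)) != 0


bitmask = flag_mask(1)           # "set": only flag 1
bitmask = add_flag(bitmask, 2)   # "add": flags 1 and 2
print(format(bitmask, "07b"), has_flag(bitmask, 2), has_flag(bitmask, 3))
```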
But as mentioned in the comments: In a relational database you usually use columns and tables to show attributes and relations rather than an encoded flag list.
As you've said in a comment, the values 01 and 1 should not be treated as equivalent (which rules out binary where they would be), so you could just store as a string.
It actually might be more efficient than storing as a byte + offset since that would take up 9 characters, whereas you need a maximum of 7 characters
Simply store as a varchar(7) or whatever the equivalent is in MySQL. No need to be clever about it, especially since you are interested in extracting positional values.
Bear in mind that this takes up a lot more storage than storing as a bit(7), since you are essentially storing 7 bytes (or whatever the storage unit is for each level of precision in a varchar), not 7 bits.
If that's not an issue then no need to over-engineer it.
You could convert the binary number to a string, with an additional byte to specify the number of leading zeros.
Example - the representation of 010:
The numeric value in hex is 0x02.
There is one leading zero, so the first byte is 0x01.
The result string is 0x01,0x02.
With the same method, 1010010 should be represented as 0x00,0x52.
That seems pretty efficient to me.
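A minimal Python sketch of this two-byte encoding, assuming inputs are strings of at most 7 binary digits (so the numeric part fits in one byte); encode_bits and decode_bits are hypothetical names:

```python
def encode_bits(text):
    """Encode a string of 0/1 digits as bytes: [leading zero count, numeric value]."""
    if not text or len(text) > 7 or set(text) - {"0", "1"}:
        raise ValueError("expected 1 to 7 binary digits")
    leading_zeros = len(text) - len(text.lstrip("0"))
    return bytes([leading_zeros, int(text, 2)])


def decode_bits(data):
    """Rebuild the original digit string from the two-byte form."""
    leading_zeros, value = data[0], data[1]
    # A value of 0 contributes no digits of its own; the zeros are all "leading".
    digits = format(value, "b") if value else ""
    return "0" * leading_zeros + digits


print(encode_bits("010").hex(), encode_bits("1010010").hex())
```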
Not sure if it is the best datatype, but you may want to try BIT:
MySQL, PostgreSQL
There are also some useful bit functions in MySQL.

Single quotes affecting the calculations in Select query

SELECT COUNT(*) FROM area
WHERE ROUND(SQRT(POWER(('71' - coords_x), 2) +
POWER(('97' - coords_y), 2))) <= 17
==> 51
SELECT COUNT(*) FROM area
WHERE ROUND(SQRT(POWER((71 - coords_x), 2) +
POWER((97 - coords_y), 2))) <= 17
==> 22
coords_x and coords_y are both TINYINT fields containing values in the range [1, 150]. Usually MySQL doesn't care if numbers are quoted or not.. but apparently it does in this case.
The question is just: Why?
MySQL always cares about data types. What happens is that your code relies on automatic type casting and performs math on strings (which may or may not hold a number). This can lead to all sorts of unpredictable results:
SELECT POW('Hello', 'World') -- This returns 1
To sum up: you need to learn and use the different data types MySQL offers. Otherwise, your application will never do reliable calculations.
Update:
One more hint:
TINYINT[(M)] [UNSIGNED] [ZEROFILL]
A very small integer. The signed range is -128 to 127. The unsigned range is 0 to 255.
URL:
http://dev.mysql.com/doc/refman/5.1/en/numeric-type-overview.html
I hope you are not trying to store 150 in a signed tinyint column.