Floating Point Types comparisons - mysql

I have inserted different values of pi (see below):
3.14
3.1415
3.14159
3.14159265359
I do not see any difference in how the different floating point types handle the same values.
Code:
mysql> select * from test_types;
+---------+---------+---------+----------+
| flo | dub | deci | noomeric |
+---------+---------+---------+----------+
| 3.14000 | 3.14000 | 3.14000 | 3.14000 |
| 3.14150 | 3.14150 | 3.14150 | 3.14150 |
| 3.14159 | 3.14159 | 3.14159 | 3.14159 |
| 3.14150 | 3.14150 | 3.14150 | 3.14150 |
| 3.14159 | 3.14159 | 3.14159 | 3.14159 |
+---------+---------+---------+----------+
5 rows in set (0.00 sec)
mysql> describe test_types;
+----------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------------+------+-----+---------+-------+
| flo | float(10,5) | YES | | NULL | |
| noomeric | decimal(10,5) | YES | | NULL | |
| deci | decimal(10,5) | YES | | NULL | |
| dub | double(10,5) | YES | | NULL | |
+----------+---------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
I can see that the column I declared as NUMERIC was created as DECIMAL (see the DESCRIBE output above).
Does anybody know an example showing differences between FLOAT, DECIMAL and DOUBLE please?

FLOAT and DOUBLE are meant for very small values or very large values.
Essentially they are the same thing, except for storage size: FLOAT takes 4 bytes and DOUBLE takes 8 bytes (see Data Type Storage Requirements).
The main thing about them is that they are approximate (quoted from the MySQL documentation on Oracle's website):
Because floating-point values are approximate and not stored as exact
values, attempts to treat them as exact in comparisons may lead to
problems. They are also subject to platform or implementation
dependencies.
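A minimal Python sketch of that approximation, outside MySQL. It assumes FLOAT behaves like an IEEE 754 4-byte value and DOUBLE like an 8-byte value, and uses Python's decimal module as a stand-in for DECIMAL. The helper name as_float32 is made up for this illustration.

```python
import struct
from decimal import Decimal


def as_float32(x: float) -> float:
    """Round-trip a Python float (8 bytes) through a 4-byte IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


pi_text = "3.14159265359"

double_value = float(pi_text)            # stand-in for DOUBLE
single_value = as_float32(double_value)  # stand-in for FLOAT
exact_value = Decimal(pi_text)           # stand-in for DECIMAL(12,11)

# Decimal(float) exposes the exact binary value that is really stored.
print("FLOAT  :", Decimal(single_value))
print("DOUBLE :", Decimal(double_value))
print("DECIMAL:", exact_value)
```

Neither binary type holds the typed digits exactly; the 4-byte value simply drifts away from them much sooner than the 8-byte one.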
DECIMAL allows an exact representation. Your DECIMAL column did not hold pi very well because you allowed only 5 decimal places but then fed it 11. The (10,5) on your FLOAT and DOUBLE columns rounds them to 5 decimal places as well, which is why all four columns look identical.
The best way to store the value of pi accurate to 11 decimal places is something like DECIMAL(12,11).
For an example of the same value being treated differently when stored as DECIMAL as opposed to FLOAT, see below:
CREATE TABLE decimal_vs_float_test
( `dec` DECIMAL(12,11) -- DEC is a reserved word in MySQL, so the name is quoted
, fl FLOAT
);
INSERT INTO decimal_vs_float_test VALUES
( 3.947947949 , 3.947947949 )
,( 3.777777777 , 3.777777777 )
,( 3.555555555 , 3.555555555 )
,( 3.333333333 , 3.333333333 )
,( 3.111111111 , 3.111111111 )
;
SELECT * FROM decimal_vs_float_test WHERE fl = `dec`;
The query should return no rows: each FLOAT holds only an approximation of the inserted value, so it no longer equals the exact DECIMAL stored next to it.
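The same comparison can be imitated in Python as a rough sketch. It assumes FLOAT is a 4-byte IEEE 754 value and uses decimal.Decimal in place of DECIMAL; it compares exact stored values rather than reproducing MySQL's own comparison rules, and as_float32 is a made-up helper name.

```python
import struct
from decimal import Decimal


def as_float32(x: float) -> float:
    """Round-trip a Python float through a 4-byte IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


inserted = [
    "3.947947949",
    "3.777777777",
    "3.555555555",
    "3.333333333",
    "3.111111111",
]

matches = []
for text in inserted:
    dec_value = Decimal(text)           # exact, like the DECIMAL column
    fl_value = as_float32(float(text))  # approximate, like the FLOAT column
    # Decimal(fl_value) is the exact value of the binary float.
    if Decimal(fl_value) == dec_value:
        matches.append(text)

print(matches)  # rows where the two columns would hold the same value
```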
Hope that helps.
Additionally, FLOAT and DOUBLE are binary floating-point types, whereas DECIMAL in MySQL is a fixed-point decimal type.
See this answer for more exact details on what that means, how the types are encoded and when it is best to use each type (it's meant for C#, where decimal is a floating decimal point type, but it's still interesting).

Related

How can I migrate from "float" to "points" in MySQL?

I'm looking for a faster way to calculate Euclidean distances in SQL.
Problem I want to solve
The following "Euclidean distance calculation" is slow.
SELECT
id,
sqrt(
power(f1 - (-0.09077361), 2) +
power(f2 - (0.10373443), 2) +
...
...
power(f127 - (0.0778369), 2) +
power(f128 - (0.00951046), 2)
) as distance
FROM
face_feature
ORDER BY
distance
LIMIT
1
;
What I want to know
Can you share how to migrate from "float" to "points"?
I received the following advice, but I don't understand how.
Switch to POINTs and a SPATIAL index. It may make your task orders of magnitude faster.
MySQL
mysql> SHOW VARIABLES LIKE '%version%';
+--------------------------+------------------------------+
| Variable_name | Value |
+--------------------------+------------------------------+
| version | 8.0.29 |
| version_comment | MySQL Community Server - GPL |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
+--------------------------+------------------------------+
Table
mysql> desc face_feature;
+-------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+----------------+
| id | int | NO | PRI | NULL | auto_increment |
| f1 | float(9,8) | NO | | NULL | |
| f2 | float(9,8) | NO | | NULL | |
..
| f127 | float(9,8) | NO | | NULL | |
| f128 | float(9,8) | NO | | NULL | |
+-------+------------+------+-----+---------+----------------+
Data
mysql> SELECT count(*) FROM face_feature;
+----------+
| count(*) |
+----------+
| 100003 |
+----------+
mysql> SELECT * FROM face_feature LIMIT 1\G;
id: 1
f1: -0.07603023
f2: 0.13605964
...
f127: 0.09608927
f128: 0.00082345
Reference (My other question)
How can I make "euclidean distance calculation" faster in MySQL?
Don't use FLOAT(M,N); it adds an extra rounding that only hurts various operations.
FLOAT(9,8), if the numbers are near "1.0" will lose some precision. This is because there are only 24 bits of precision in any FLOAT.
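A small Python sketch of that 24-bit limit, assuming FLOAT is stored as a 4-byte IEEE 754 value (as_float32 is a made-up helper, and the near-1.0 sample is invented for illustration; the other sample is f1 from the question's data).

```python
import struct


def as_float32(x: float) -> float:
    """Round-trip a Python float through a 4-byte IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# A value near 1.0 with 8 decimal places: the 8th place does not survive.
near_one = as_float32(0.99999999)
print(near_one == 1.0)

# A small value like those in face_feature still shows its 8 decimals,
# because the 24 bits are spent on a much smaller magnitude.
small = as_float32(-0.07603023)
print(f"{small:.8f}")
```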
(m,n) on FLOAT and DOUBLE has been deprecated (as useless and misleading) in newer versions of MySQL.
There are helper functions to convert numeric strings to POINT values. Internally, a POINT contains two DOUBLEs. Hence the original 9-digit decimal values lose only a round-from-decimal-to-binary at the 53rd significant bit.
But the real question is about using SPATIAL indexing when the universe has 128 dimensions. I don't think it will work. (I have not even heard of using SPATIAL for 3 dimensions, though it should be practical.)

How should I construct a database to store a lot of SHA1 data

I'm having trouble constructing a database to store a lot of SHA1 data and efficiently return results.
I will admit SQL is not my strongest skill, but as an exercise I am trying to use the data from https://haveibeenpwned.com/Passwords, a site which returns results pretty quickly.
This is my data:
mysql> describe pwnd;
+----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| pwndpass | binary(20) | NO | | NULL | |
+----------+------------------+------+-----+---------+----------------+
mysql> select id, hex(pwndpass) from pwnd order by id desc limit 10;
+-----------+------------------------------------------+
| id | hex(pwndpass) |
+-----------+------------------------------------------+
| 306259512 | FFFFFFFEE791CBAC0F6305CAF0CEE06BBE131160 |
| 306259511 | FFFFFFF8A0382AA9C8D9536EFBA77F261815334D |
| 306259510 | FFFFFFF1A63ACC70BEA924C5DBABEE4B9B18C82D |
| 306259509 | FFFFFFE3C3C05FCB0B211FD0C23404F75E397E8F |
| 306259508 | FFFFFFD691D669D3364161E05538A6E81E80B7A3 |
| 306259507 | FFFFFFCC6BD39537AB7398B59CEC917C66A496EB |
| 306259506 | FFFFFFBFAD0B653BDAC698485C6D105F3C3682B2 |
| 306259505 | FFFFFFBBFC923A29A3B4931B63684CAAE48EAC4F |
| 306259504 | FFFFFFB58E389A0FB9A27D153798956187B1B786 |
| 306259503 | FFFFFFB54953F45EA030FF13619B930C96A9C0E3 |
+-----------+------------------------------------------+
10 rows in set (0.01 sec)
My question is about finding entries quickly, as a lookup currently takes over 6 minutes:
mysql> select hex(pwndpass) from pwnd where hex(pwndpass) = '0000000A1D4B746FAA3FD526FF6D5BC8052FDB38';
+------------------------------------------+
| hex(pwndpass) |
+------------------------------------------+
| 0000000A1D4B746FAA3FD526FF6D5BC8052FDB38 |
+------------------------------------------+
1 row in set (6 min 31.82 sec)
Do I have the correct data types? I searched for how to store SHA1 data and a BINARY(20) column is advised, but I am not sure how to optimise it for searching.
My MySQL install is a clean TurnKey VM (https://www.turnkeylinux.org/mysql). I have not adjusted any settings other than giving the VM more disk space.
The two most obvious tips are:
Create an index on the column.
Don't convert every single row to hexadecimal on every search:
select hex(pwndpass)
from pwnd
where hex(pwndpass) = '0000000A1D4B746FAA3FD526FF6D5BC8052FDB38';
-- ^^^ This is forcing MySQL to convert every hash stored from binary to hexadecimal
-- so it can determine whether there's a match
In fact, you don't even need hexadecimal at all, save for display purposes:
select id, hex(pwndpass) -- This is fine, will just convert matching rows
from pwnd
where pwndpass = ?
... where ? is a placeholder that, in your client language, corresponds to a binary string.
If you need to run the query right in the command line, you can also use a hexadecimal literal:
select id, hex(pwndpass) -- This is fine, will just convert matching rows
from pwnd
where pwndpass = 0x0000000A1D4B746FAA3FD526FF6D5BC8052FDB38
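On the client side, the binary string for the placeholder can be built as in this Python sketch. It only prepares the parameter; the %s placeholder style and the idea of passing the bytes to a driver's execute call are assumptions about whichever MySQL driver you use, and the function names are invented for this example.

```python
import hashlib


def sha1_param(password: str) -> bytes:
    """Return the raw 20-byte SHA1 digest, ready to compare with BINARY(20)."""
    return hashlib.sha1(password.encode("utf-8")).digest()


def hex_to_param(hex_hash: str) -> bytes:
    """Convert a 40-character hex hash into the 20 raw bytes MySQL stores."""
    return bytes.fromhex(hex_hash)


# Assumed placeholder style; adjust to your driver.
QUERY = "SELECT id, HEX(pwndpass) FROM pwnd WHERE pwndpass = %s"

param = hex_to_param("0000000A1D4B746FAA3FD526FF6D5BC8052FDB38")
print(len(param))  # number of bytes sent for the comparison

# With a real connection this would be something like:
#   cursor.execute(QUERY, (param,))
```

Comparing raw bytes against the column is what lets an index on pwndpass be used, since no function wraps the column in the WHERE clause.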

Truncate column names in SELECT (MySQL client)

When I'm looking into new databases to explore what is there, usually I get tables with long column names but short contents, like:
mysql> select * from Seat limit 2;
+---------+---------------------+---------------+------------------+--------------+---------------+--------------+-------------+--------------+-------------+---------+---------+----------+------------+---------------+------------------+-----------+-------------+---------------+-----------------+---------------------+-------------------+-----------------+
| seat_id | seat_created | seat_event_id | seat_category_id | seat_user_id | seat_order_id | seat_item_id | seat_row_nr | seat_zone_id | seat_pmp_id | seat_nr | seat_ts | seat_sid | seat_price | seat_discount | seat_discount_id | seat_code | seat_status | seat_sales_id | seat_checked_by | seat_checked_date | seat_old_order_id | seat_old_status |
+---------+---------------------+---------------+------------------+--------------+---------------+--------------+-------------+--------------+-------------+---------+---------+----------+------------+---------------+------------------+-----------+-------------+---------------+-----------------+---------------------+-------------------+-----------------+
| 4897 | 2016-09-01 00:05:54 | 330 | 331 | NULL | NULL | NULL | 0 | NULL | NULL | 0 | NULL | NULL | NULL | 0.00 | NULL | NULL | free | NULL | NULL | 0000-00-00 00:00:00 | NULL | NULL |
| 4898 | 2016-09-01 00:05:54 | 330 | 331 | NULL | NULL | NULL | 0 | NULL | NULL | 0 | NULL | NULL | NULL | 0.00 | NULL | NULL | free | NULL | NULL | 0000-00-00 00:00:00 | NULL | NULL |
+---------+---------------------+---------------+------------------+--------------+---------------+--------------+-------------+--------------+-------------+---------+---------+----------+------------+---------------+------------------+-----------+-------------+---------------+-----------------+---------------------+-------------------+-----------------+
Since the header is longer than the contents of each row, the output wraps and is hard to understand, especially when you are searching for small clues such as fields that aren't being used.
Is there any way to tell the mysql client to truncate column names automatically, for example to a maximum of 10 characters? The first 10 characters are usually enough to know which column they refer to.
Of course I could establish column aliases for that with AS, but if there are too many columns and you want to do a fast exploration, that would take too long for each table.
Another solution would be to tell mysql to remove the prefix seat_ from each column name, for example (of course, for each table I would need to change the prefix used).
I don't think there's any way to do that automatically. Some options are:
1) Use a graphical UI such as PhpMyAdmin to view the table contents. These typically allow you to adjust column widths.
2) End the query with \G instead of ;:
mysql> SELECT * FROM seat LIMIT 2\G
This will display the columns vertically instead of horizontally:
seat_id: 4897
seat_created: 2016-09-01 00:05:54
seat_event_id: 330
...
I often use the latter for tables with lots of columns because reading the horizontal format can be difficult, especially when it wraps around on the terminal.
3) Use the less pager in a mode that doesn't wrap lines. You can then scroll left and right with the arrow keys.
mysql> pager less -S
See How to better display MySQL table on Terminal
You can skip the column names completely by running the MySQL client with the -N or --skip-column-names option. Then the width of your columns will be determined by the widest data, not the column name. But there would be no row for the column names.
You can also use column aliases to set your own column names, but you'd have to enter these yourself manually.

sql float zero padding

I am wondering if there is a way to convert a float number in SQL to a number that is zero padded after the decimal point. For example, if I have the following table:
.--------------------.-------.
| name | grade |
.--------------------.-------.
| courseE | 5 |
| courseG | 4 |
| courseB | 2.5 |
| courseC | 2.5 |
| courseF | 1.25 |
| courseD | 0 |
I want to convert the field grade to a zero padded number, so that the result looks like this:
.--------------------.-------.
| name | grade |
.--------------------.-------.
| courseE | 5.000 |
| courseG | 4.000 |
| courseB | 2.500 |
| courseC | 2.500 |
| courseF | 1.250 |
| courseD | 0.000 |
I have tried to convert the field grade to a float using CAST(EXP) AS FLOAT, but it did not work. Why?
Thanks in advance.
Your error starts at FLOAT: CAST() does not accept it as a target type (MySQL only added CAST(... AS FLOAT) in 8.0.17). Instead you should use DECIMAL, which also lets you set the total number of digits and the number of decimal digits:
SELECT name, CAST(grade AS DECIMAL(4, 3)) FROM t
About the format: the first number is the total number of digits the value can hold, and the second is how many of those digits come after the decimal point. Strictly speaking this is not a display format but a constraint of the data type (DECIMAL is a fixed-point type in MySQL).
select convert(grade, decimal(10, 3)) from t
The number "3" represents the number of decimals you want after the "."
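If the padding is done in application code instead, the same idea can be sketched in Python with the decimal module. This is an illustration outside MySQL, and pad_grade is a made-up helper name; the sample grades are taken from the question's table.

```python
from decimal import Decimal, ROUND_HALF_UP


def pad_grade(grade, places: int = 3) -> str:
    """Format a grade with a fixed number of digits after the decimal point."""
    quantum = Decimal(1).scaleb(-places)  # places=3 gives Decimal('0.001')
    exact = Decimal(str(grade))           # go through str to avoid binary noise
    return str(exact.quantize(quantum, rounding=ROUND_HALF_UP))


grades = [("courseE", 5), ("courseB", 2.5), ("courseF", 1.25), ("courseD", 0)]

for name, grade in grades:
    print(name, pad_grade(grade))
```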

Why MySQL adds extra digits to floats?

I have a table of prices.
Each price is a FLOAT with two digits after the dot.
For some reason, when I use the price in an IF expression, the result is the same float with many additional digits:
mysql> select price, IF(1, price,0) as my_price from tbl_prices limit 10;
+-------+------------------+
| price | my_price |
+-------+------------------+
| 79.95 | 79.9499969482422 |
| 99.95 | 99.9499969482422 |
| 89.95 | 89.9499969482422 |
| 89.95 | 89.9499969482422 |
| 79.95 | 79.9499969482422 |
| 89.95 | 89.9499969482422 |
| 89.95 | 89.9499969482422 |
| 79.95 | 79.9499969482422 |
| 79.95 | 79.9499969482422 |
| 69.95 | 69.9499969482422 |
+-------+------------------+
10 rows in set (0.00 sec)
As you can see, price looks good, however the result of IF expression that returns the same price contains garbage.
Does anybody know what is the reason for this garbage, and how can I get rid of it (without using ROUND)?
Thanks in advance!
Just don't. A float is not an exact value. Use DECIMAL fields for example for a price.
Because you can't represent the .95 in floating point. This is the closest you will get. This is why float is approximate.
If you want exact decimal places, use DECIMAL
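The extra digits can be reproduced outside MySQL. This Python sketch assumes FLOAT is a 4-byte IEEE 754 value that gets widened to 8 bytes inside the expression; as_float32 is a made-up helper name.

```python
import struct
from decimal import Decimal


def as_float32(x: float) -> float:
    """Round-trip a Python float through a 4-byte IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


stored = as_float32(79.95)  # what a FLOAT column can actually hold

# Shown with few digits, the approximation is hidden ...
short_form = f"{stored:.2f}"
# ... shown with 15 significant digits, it is not.
long_form = format(stored, ".15g")

print(short_form)
print(long_form)

# A decimal type keeps the typed digits exactly.
exact_price = Decimal("79.95")
print(exact_price + Decimal("0.05"))
```

The two printed forms are the same stored number; only the number of digits displayed differs, which is consistent with the price and my_price columns in the question.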
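First, you should understand how floating point works: a FLOAT stores the nearest binary approximation of a value, not the decimal digits you typed.
Fortunately MySQL provides the DECIMAL data type, which lets you specify an exact precision, for example: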
DECIMAL( 10, 2)
This stores a number with up to 10 digits in total, 2 of which are to the right of the decimal point, for example:
12345678.12
1.03
and so on.
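The digit budget of DECIMAL(10, 2) can be checked with a short Python sketch. fits_decimal is a hypothetical helper written for this illustration; it only counts digits for finite values and says nothing about how MySQL rounds or rejects values that do not fit.

```python
from decimal import Decimal


def fits_decimal(text: str, precision: int = 10, scale: int = 2) -> bool:
    """True if the number fits DECIMAL(precision, scale) without rounding."""
    _sign, digits, exponent = Decimal(text).as_tuple()
    fraction_digits = max(-exponent, 0)
    integer_digits = max(len(digits) + exponent, 0)
    return fraction_digits <= scale and integer_digits <= precision - scale


for candidate in ["12345678.12", "1.03", "123456789.1", "1.005"]:
    print(candidate, fits_decimal(candidate))
```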