Range of unsigned FLOAT values in MySQL - mysql

According to the MySQL Numeric types documentation, among other, Float types have these properties:
FLOAT[(M,D)] [UNSIGNED] [ZEROFILL]
A small (single-precision) floating-point number. Permissible values are -3.402823466E+38 to -1.175494351E-38, 0, and 1.175494351E-38 to 3.402823466E+38. These are the theoretical limits, based on the IEEE standard. The actual range might be slightly smaller depending on your hardware or operating system.
M is the total number of digits and D is the number of digits following the decimal point. If M and D are omitted, values are stored to the limits permitted by the hardware. A single-precision floating-point number is accurate to approximately 7 decimal places.
UNSIGNED, if specified, disallows negative values.
I want to determine if the behavior of unsigned float is similar to the behavior of unsigned int where the the full bit depth of the decimal and mantissa is used to represent positive values, or if unsigned floats only prevent negative values.

I want to determine if the behavior of unsigned float is similar to
the behavior of unsigned int where the the full bit depth of the
decimal and mantissa is used to represent positive values, or
No. MySQL chose to use IEEE-754 Single precision Float for its FLOAT. IEEE Float is "sign-magnitude", so the sign bit occupies a specific location of its own. INT, on the other hand, is 2's complement encoding. So, the bit for the sign can be either interpreted as a sign or as an extension of the value. (Caveat: Virtually all current computer hardware works the way described in this paragraph; but there may be some that do not. An antique example: The CDC 6600 used 1's complement.)
if unsigned floats only prevent negative values.
Yes.
It has to be said: If you're using the inherently imprecise FLOAT data type you should think through why your problem space prohibits all negative numbers. Do those numbers faithfully represent the problem space? Is FLOAT's imprecision acceptable for what you're representing?
The idea that there can be no computation on floating-point numbers is illusory. Just generating them from integers involves computation. Yes, it's done directly on the computer chip, but it's still computation.
And, the example of packet loss is
precisely represented by two integers: number of packets lost / number of packets (you could call this a rational number if you wanted), or
approximately represented by a FLOAT fraction resulting from dividing one of those integers by the other.
With respect, you're overthinking this.
FLOAT always occupies 4 bytes, regardless of the options tacked onto it. DECIMAL(m,n), on the other hand, occupies approx (m/2) bytes.
FLOAT(m,n), is probably useless. The n says to round in decimal to n places, then store into a binary number (IEEE Float), thereby causing another rounding.

Related

Negative fixed point number representation

I am writing a generic routine for converting fixed-point numbers between decimal and binary representations.
For positive numbers the processing is simple, however when things come to negative ones I found divergent sources. Someone says there is a single bit used to hold the sign while others say the whole number should be represented in a pseudo integer using 2's complement even it is negative.
Please anyone tell me which source is correct or is there a standard representation for signed fixed point numbers?
Additionally, if the 2's complement representation was correct then how to represent negative numbers with zero integer part. For example -0.125?
Fixed-point numbers are just binary values where the place values have been changed. Assigning place values to the bits is an arbitrary human activity, and we can do it in any way that makes sense. Normally we talk about binary integers so it is convenient to assign the place value 2^0 = 1 to the LSB, 2^1=2 to the bit to the left of the LSB, and so on. For an N bit integer the place value of the MSB becomes 2^(N-1). If we want a two's-complement representation, we change the place value of the MSB to -2^(N-1) and all of the other bit place values are unchanged.
For fixed-point values, if we want F bits to represent a fractional part of the number, then the place value of the LSB becomes 2^(0-F)
and the place value of the MSB becomes 2^(N-1-F) for unsigned numbers and -2^(N-1-F) for signed numbers.
So, how would we represent -0.125 in a two's-complement fixed-point value? That is equal to 0.875 - 1, so we can use a representation where the place value of the MSB is -1 and the value of all of the other bits adds up to 0.875. If you choose a
4-bit fixed-point number with 3 fraction bits you would say that
1111 binary equals -0.125 decimal. Adding up the place values of the bits we have (-1) + 0.5 + 0.25 + 0.125 = -0.125. My personal preference is to write the binary number as 1.111 to note which bits are fraction and which are integer.
The reason we use this approach is that the normal integer arithmetic operators still work.
It's easiest to think of fixed-point numbers as scaled integers — rather than shifted integers. For a given fixed-point type, there is a fixed scale which is a power of two (or ten). To convert from the real value to the integer representation, multiply by that scale. To convert back again, simply divide. Then the issue of how negative values are represented becomes a detail of the integer type with which you are representing your number.
Please anyone tell me which source is correct...
Both are problematic.
Your first source is incorrect. The given example is not...
the same as 2's complement numbers.
In two’s complement, the MSB's (most significant bit's) weight is negated but the other bits still contribute positive values. Thus a two’s complement number with all bits set to 1 does not produce the minimum value.
Your second source could be a little misleading where it says...
shifting the bit pattern of a number to the right by 1 bit always divide the number by 2.
This statement brushes over the matter of underflow that occurs when the LSB (least significant bit) is set to 1, and the resultant rounding. Right-shifting commonly results in rounding towards negative infinity while division results in rounding towards zero (truncation). Both produce the same behavior for positive numbers: 3/2 == 1 and 3>>1 == 1. For negative numbers, they are contrary: -3/2 == -1 but -3>>1 == -2.
...is there a standard representation for signed fixed point numbers?
I don't think so. There are language-specific standards, e.g. ISO/IEC TR 18037 (draft). But the convention of scaling integers to approximate real numbers of predetermined range and resolution is well established. How the underlying integers are represented is another matter.
Additionally, if the 2's complement representation was correct then how to represent negative numbers with zero integer part. For example -0.125?
That depends on the format of your integer and your choice of radix. Assuming a 16-bit two’s complement number representing binary fixed-point values, the scaling factor is 2^15 which is 32,768. Multiply the value to store as an integer: -0.125*32768. == -4096 and divide to retrieve it: -4096/32768. == -0.125.

Storing statistical data, do I need DECIMAL, FLOAT or DOUBLE?

I am creating for fun, but I still want to approach it seriously, a site which hosts various tests. With these tests I hope to collect statistical data.
Some of the data will include the percentage of the completeness of the tests as they are timed. I can easily compute the percentage of the tests but I would like true data to be returned as I store the various different values concerning the tests on completion.
Most of the values are, in PHP floats, so my question is, if I want true statistical data should I store them in MYSQL as FLOAT, DOUBLE or DECIMAL.
I would like to utilize MYSQL'S functions such as AVG() and LOG10() as well as TRUNCATE(). For MYSQL to return true data based off of my values that I insert, what should I use as the database column choice.
I ask because some numbers may or may not be floats such as, 10, 10.89, 99.09, or simply 0.
But I would like true and valid statistical data to be returned.
Can I rely on floating point math for this?
EDIT
I know this is a generic question, and I apologise extensively, but for non mathematicians like myself, also I am not a MYSQL expert, I would like an opinion of an expert in this field.
I have done my research but I still feel I have a clouded judgement on the matter. Again I apologise if my question is off topic or not suitable for this site.
This link does a good job of explaining what you are looking for. Here is what is says:
All these three Types, can be specified by the following Parameters (size, d). Where size is the total size of the String, and d represents precision. E.g To store a Number like 1234.567, you will set the Datatype to DOUBLE(7, 3) where 7 is the total number of digits and 3 is the number of digits to follow the decimal point.
FLOAT and DOUBLE, both represent floating point numbers. A FLOAT is for single-precision, while a DOUBLE is for double-precision numbers. A precision from 0 to 23 results in a 4-byte single-precision FLOAT column. A precision from 24 to 53 results in an 8-byte double-precision DOUBLE column. FLOAT is accurate to approximately 7 decimal places, and DOUBLE upto 14.
Decimal’s declaration and functioning is similar to Double. But there is one big difference between floating point values and decimal (numeric) values. We use DECIMAL data type to store exact numeric values, where we do not want precision but exact and accurate values. A Decimal type can store a Maximum of 65 Digits, with 30 digits after decimal point.
So, for the most accurate and precise value, Decimal would be the best option.
Unless you are storing decimal data (i.e. currency), you should use a standard floating point type (FLOAT or DOUBLE). DECIMAL is a fixed point type, so can overflow when computing things like SUM, and will be ridiculously inaccurate for LOG10.
There is nothing "less precise" about binary floating point types, in fact, they will be much more accurate (and faster) for your needs. Go with DOUBLE.
Decimal : Fixed-Point Types (Exact Value). Use it when you care about exact precision like money.
Example: salary DECIMAL(8,2), 8 is the total number of digits, 2 is the number of decimal places. salary will be in the range of -999999.99 to 999999.99
Float, Double : Floating-Point Types (Approximate Value). Float uses 4 bytes to represent value, Double uses 8 bytes to represent value.
Example: percentage FLOAT(5,2), same as the type decimal, 5 is total digits and 2 is the decimal places. percentage will store values between -999.99 to 999.99.
Note that they are approximate value, in this case:
Value like 1 / 3.0 = 0.3333333... will be stored as 0.33 (2 decimal place)
Value like 33.009 will be stored as 33.01 (rounding to 2 decimal place)
Put it simply, Float and double are not as precise as decimal. decimal is recommended for money related number input.(currency and salary).
Another point need to point out is: Do NOT compare float number using "=","<>", because float numbers are not precise.
Linger: The website you mention and quote has IMO some imprecise info that made me confused. In the docs I read that when you declare a float or a double, the decimal point is in fact NOT included in the number. So it is not the number of chars in a string but all digits used.
Compare the docs:
"DOUBLE PRECISION(M,D).. Here, “(M,D)” means than values can be stored with up to M digits in total, of which D digits may be after the decimal point. For example, a column defined as FLOAT(7,4) will look like -999.9999 when displayed"
http://dev.mysql.com/doc/refman/5.1/en/floating-point-types.html
Also the nomenclature in misleading - acc to docs: M is 'precision' and D is 'scale', whereas the website takes 'scale' for 'precision'.
Thought it would be useful in case sb like me was trying to get a picture.
Correct me if I'm wrong, hope I haven't read some outdated docs:)
Float and Double are Floating point data types, which means that the numbers they store can be precise up to a certain number of digits only.
For example for a table with a column of float type if you store 7.6543219 it will be stored as 7.65432.
Similarly the Double data type approximates values but it has more precision than Float.
When creating a table with a column of Decimal data type, you specify the total number of digits and number of digits after decimal to store, and if the number you store is within the range you specified it will be stored exactly.
When you want to store exact values, Decimal is the way to go, it is what is known as a fixed data type.
Simply use FLOAT. And do not tack on '(m,n)'. Do display numbers to a suitable precision with formatting options. Do not expect to get correct answers with "="; for example, float_col = 0.12 will always return FALSE.
For display purposes, use formatting to round the results as needed.
Percentages, averages, etc are all rounded (at least in some cases). That any choice you make will sometimes have issues.
Use DECIMAL(m,n) for currency; use ...INT for whole numbers; use DOUBLE for scientific stuff that needs more than 7 digits of precision; use FLOAT` for everything else.
Transcendentals (such as the LOG10 that you mentioned) will do their work in DOUBLE; they will essentially never be exact. It is OK to feed it a FLOAT arg and store the result in FLOAT.
This Answer applies not just to MySQL, but to essentially any database or programming language. (The details may vary.)
PS: (m,n) has been removed from FLOAT and DOUBLE. It only added extra rounding and other things that were essentially no benefit.

negative integers in binary

5 (decimal) in binary 00000101
-5 (two's complement) in binary 11111011
but 11111011 is also 251 (decimal)!
How does computer discern one from another??
How does it know whether it's -5 or 251??
it's THE SAME 11111011
Thanks in advance!!
Signed bytes have a maximum of 127.
Unsigned bytes cannot be negative.
The compiler knows whether the variable holding that value is of signed or unsigend type, and treats it appropriately.
If your program chooses to treat the byte as signed, the run-time system decides whether the byte is to be considered positive or negative according to the high-order bit. A 1 in that high-order bit (bit 7, counting from the low-order bit 0) means the number is negative; a 0 in that bit position means the number is positive. So, in the case of 11111011, bit 7 is set to 1 and the number is treated, accordingly, as negative.
Because the sign bit takes up one bit position, the absolute magnitude of the number can range from 0 to 127, as was said before.
If your program chooses to treat the byte as unsigned, on the other hand, what would have been the sign bit is included in the magnitude, which can then range from 0 to 255.
Two's complement is designed to allow signed numbers to be added/substracted to one another in the same way unsigned numbers are. So there are only two cases where the signed-ness of numbers affect the computer at low level.
when there are overflows
when you are performing operations on mixed: one signed, one unsigned
Different processors take different tacks for this. WRT orverflows, the MIPS RISC architecture, for example, deals with overflows using traps. See http://en.wikipedia.org/wiki/MIPS_architecture#MIPS_I_instruction_formats
To the best of my knowledge, mixing signed and unsigned needs to avoided at a program level.
If you're asking "how does the program know how to interpret the value" - in general it's because you've told the compiler the "type" of the variable you assigned the value to. The program doesn't actually care if 00000101 as "5 decimal", it just has an unsigned integer with value 00000101 that it can perform operations legal for unsigned integers upon, and will behave in a given manner if you try to compare with or cast to a different "type" of variable.
At the end of the day everything in programming comes down to binary - all data (strings, numbers, images, sounds etc etc) and the compiled code just ends up as a large binary blob.

The difference between signed and unsigned in MySQL?

What is the difference between signed and unsigned in MySQL? And what does signed and unsigned mean?
Unsigned numbers do not have the minus sign. Unsigned number can be only positive or zero (e.g. 123, 0). Signed numbers can be also negative (e.g. -42).
This answer explains the difference throughly.
The range that you can store in a given space. E.g., quoting from the docs:
TINYINT[(M)] [UNSIGNED] [ZEROFILL]
A very small integer. The signed range
is -128 to 127. The unsigned range is
0 to 255.
and similarly of course for other larger integer types.
Range of possible values, as seen on this table.
It's not specific to MySQL, it is a consequence of how integers are represented in a computer. Sign takes one bit for itself, thus the maximum number gets (roughly) halved. You can also think of it like shifting the whole thing by half the range downwards. (Also, because there's an even number of available numbers and there aren't two zeroes, you get one more negative number than positive). If you want to know more, read up on two's complement.

What is an integer overflow error?

What is an integer overflow error?
Why do i care about such an error?
What are some methods of avoiding or preventing it?
Integer overflow occurs when you try to express a number that is larger than the largest number the integer type can handle.
If you try to express the number 300 in one byte, you have an integer overflow (maximum is 255). 100,000 in two bytes is also an integer overflow (65,535 is the maximum).
You need to care about it because mathematical operations won't behave as you expect. A + B doesn't actually equal the sum of A and B if you have an integer overflow.
You avoid it by not creating the condition in the first place (usually either by choosing your integer type to be large enough that you won't overflow, or by limiting user input so that an overflow doesn't occur).
The easiest way to explain it is with a trivial example. Imagine we have a 4 bit unsigned integer. 0 would be 0000 and 1111 would be 15. So if you increment 15 instead of getting 16 you'll circle back around to 0000 as 16 is actually 10000 and we can not represent that with less than 5 bits. Ergo overflow...
In practice the numbers are much bigger and it circles to a large negative number on overflow if the int is signed but the above is basically what happens.
Another way of looking at it is to consider it as largely the same thing that happens when the odometer in your car rolls over to zero again after hitting 999999 km/mi.
When you store an integer in memory, the computer stores it as a series of bytes. These can be represented as a series of ones and zeros.
For example, zero will be represented as 00000000 (8 bit integers), and often, 127 will be represented as 01111111. If you add one to 127, this would "flip" the bits, and swap it to 10000000, but in a standard two's compliment representation, this is actually used to represent -128. This "overflows" the value.
With unsigned numbers, the same thing happens: 255 (11111111) plus 1 would become 100000000, but since there are only 8 "bits", this ends up as 00000000, which is 0.
You can avoid this by doing proper range checking for your correct integer size, or using a language that does proper exception handling for you.
An integer overflow error occurs when an operation makes an integer value greater than its maximum.
For example, if the maximum value you can have is 100000, and your current value is 99999, then adding 2 will make it 'overflow'.
You should care about integer overflows because data can be changed or lost inadvertantly, and can avoid them with either a larger integer type (see long int in most languages) or with a scheme that converts long strings of digits to very large integers.
Overflow is when the result of an arithmetic operation doesn't fit in the data type of the operation. You can have overflow with a byte-sized unsigned integer if you add 255 + 1, because the result (256) does not fit in the 8 bits of a byte.
You can have overflow with a floating point number if the result of a floating point operation is too large to represent in the floating point data type's exponent or mantissa.
You can also have underflow with floating point types when the result of a floating point operation is too small to represent in the given floating point data type. For example, if the floating point data type can handle exponents in the range of -100 to +100, and you square a value with an exponent of -80, the result will have an exponent around -160, which won't fit in the given floating point data type.
You need to be concerned about overflows and underflows in your code because it can be a silent killer: your code produces incorrect results but might not signal an error.
Whether you can safely ignore overflows depends a great deal on the nature of your program - rendering screen pixels from 3D data has a much greater tolerance for numerical errors than say, financial calculations.
Overflow checking is often turned off in default compiler settings. Why? Because the additional code to check for overflow after every operation takes time and space, which can degrade the runtime performance of your code.
Do yourself a favor and at least develop and test your code with overflow checking turned on.
From wikipedia:
In computer programming, an integer
overflow occurs when an arithmetic
operation attempts to create a numeric
value that is larger than can be
represented within the available
storage space. For instance, adding 1 to the largest value that can be represented
constitutes an integer overflow. The
most common result in these cases is
for the least significant
representable bits of the result to be
stored (the result is said to wrap).
You should care about it especially when choosing the appropriate data types for your program or you might get very subtle bugs.
From http://www.first.org/conference/2006/papers/seacord-robert-slides.pdf :
An integer overflow occurs when an integer is
increased beyond its maximum value or
decreased beyond its minimum value.
Overflows can be signed or unsigned.
P.S.: The PDF has detailed explanation on overflows and other integer error conditions, and also how to tackle/avoid them.
I'd like to be a bit contrarian to all the other answers so far, which somehow accept crappy broken math as a given. The question is tagged language-agnostic and in a vast number of languages, integers simply never overflow, so here's my kind-of sarcastic answer:
What is an integer overflow error?
An obsolete artifact from the dark ages of computing.
why do i care about it?
You don't.
how can it be avoided?
Use a modern programming language in which integers don't overflow. (Lisp, Scheme, Smalltalk, Self, Ruby, Newspeak, Ioke, Haskell, take your pick ...)
I find showing the Two’s Complement representation on a disc very helpful.
Here is a representation for 4-bit integers. The maximum value is 2^3-1 = 7.
For 32 bit integers, we will see the maximum value is 2^31-1.
When we add 1 to 2^31-1 : Clockwise we move by one and it is clearly -2^31 which is called integer overflow
Ref : https://courses.cs.washington.edu/courses/cse351/17wi/sections/03/CSE351-S03-2cfp_17wi.pdf
This happens when you attempt to use an integer for a value that is higher than the internal structure of the integer can support due to the number of bytes used. For example, if the maximum integer size is 2,147,483,647 and you attempt to store 3,000,000,000 you will get an integer overflow error.