get count of leading zeros in mysql - mysql

Is it possible to create a select query that computes the number of leading zeros from a bit-operation in mysql? I would need to compare this to a threshold and return the results.
The SELECT query would be called very often, but I think that any SQL-based solution is better than our current strategy of loading everything into memory and then doing it in Java.
example of the functionality:
unsigned INT with the value 1 -> 31, since there are 32 bits and only the rightmost is set
unsigned INT with the value 0x0C000000 -> 4, there are 4 zero bits from the highest order to the first set
I can then compare the result to a threshold and only get the ones above the threshold.
Pseudocode example query:
SELECT *
FROM data
WHERE numberOfLeadingZeros(data.data XOR parameter) >= threshold;

This should work, but please take a look at @viraptor's suggestion:
SELECT * FROM data WHERE (32 - LENGTH(BIN(data))) >= threshold
This just converts the integer into a binary string without the leading zeros. So if you subtract the length of this string from 32, you get the number of leading zeros in the value. Note that BIN(0) returns '0', so a value of 0 yields 31 rather than 32.
Fiddle: http://sqlfiddle.com/#!9/e56c31/2
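To check the arithmetic outside the database, here is a minimal Python sketch of the same idea. The function names are hypothetical and not part of any answer above; the second function mirrors 32 - LENGTH(BIN(x)) and shows how a value of 0 behaves.

```python
def leading_zeros_32(x: int) -> int:
    """Count leading zero bits of x viewed as a 32-bit unsigned integer."""
    if not 0 <= x < 2**32:
        raise ValueError("x must fit in 32 unsigned bits")
    # int.bit_length() is 0 for x == 0, so zero yields 32 here.
    return 32 - x.bit_length()


def leading_zeros_via_bin(x: int) -> int:
    """Mirror of 32 - LENGTH(BIN(x)): the binary string of 0 is '0' (length 1)."""
    return 32 - len(format(x, "b"))


print(leading_zeros_32(1))           # 31
print(leading_zeros_32(0x0C000000))  # 4
print(leading_zeros_via_bin(0))      # 31, not 32
```

If a result of 32 is needed for the value 0, that case would have to be handled separately in the query.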

You can skip counting the zeros and compare against a threshold value directly. For example, the 32-bit value 2 has 30 leading zeros: anything with fewer leading zeros is greater than 2, and anything with more leading zeros is less than 2.
So for your solution, instead of querying number_of_zeros(x) >= threshold, query x < limit, where limit = 2^(32 - threshold) is the smallest value that has fewer than threshold leading zeros.
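The boundary value is easy to get off by one, so it is worth brute-forcing. The sketch below assumes 32-bit unsigned values and checks that "at least t leading zeros" is the same condition as x < 2^(32 - t); the helper names and sample values are hypothetical.

```python
def leading_zeros_32(x: int) -> int:
    """Leading zero bits of x as a 32-bit unsigned integer (32 for x == 0)."""
    return 32 - x.bit_length()


def passes_threshold(x: int, threshold: int) -> bool:
    """Range comparison that replaces the leading-zero count."""
    return x < (1 << (32 - threshold))


samples = [0, 1, 2, 3, 4, 0x0C000000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
mismatches = [
    (x, t)
    for t in range(0, 33)
    for x in samples
    if (leading_zeros_32(x) >= t) != passes_threshold(x, t)
]
print(mismatches)  # an empty list means both conditions agree on every sample
```

For the query in the question, x would be the bitwise XOR of the column and the parameter. In MySQL the bitwise XOR operator is ^; the XOR keyword is a logical operator.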

Related

Why is MySQL rounding division of 2 decimal to an unexpected value?

I have a table with the following data:
Area VARCHAR(50),
Revenue DECIMAL(20,2),
Expense DECIMAL(20,2),
PercentBilled DECIMAL(6,2)
These values were imported from a spreadsheet and the percent billed is not precise enough. It's rounded to 2 decimal places. I can calculate it by taking Revenue / Expense, but I'm not getting the value I'm expecting.
Select * from would return 'Counter', -1822.90, 2749.63, 0.66
-1822.90 / 2749.63 = -0.6629619.... which the absolute value would round to 66.3%, which is the precision I need.
So why then, when I run the following query:
Select Area, (Revenue / Expense) AS calcPercentBilled, PercentBilled from
I get: 'Counter', -1822.90, 2749.63, 1.000000, 0.66?
My guess is it's something funky with MySQL and types, but I can't figure out what's happening. Why is MySQL rounding the division of 2 decimal numbers in a query to 1.000000?
The problem is the data type you have specified for PercentBilled.
DECIMAL(6,2) stores the value with up to 6 digits in total, 2 of them after the decimal point. So to increase the precision of the stored value you'll need to change the type of the column, perhaps to DECIMAL(9,6). This would allow the value in your example to be stored as "0.662962" (i.e. six decimal places).
Note that simply updating the type of the column will not magic up the missing precision unless you re-import your data from source. Simply changing the data type without re-loading the data will change it to "0.660000".
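The effect of the column scale can be reproduced with Python's decimal module. This is only an illustration of the rounding, not of MySQL's internals; the variable names are hypothetical and the values are the ones from the question.

```python
from decimal import Decimal, ROUND_HALF_UP

revenue = Decimal("-1822.90")
expense = Decimal("2749.63")
ratio = revenue / expense

# Scale 2, as in DECIMAL(6,2): only two decimals survive.
two_places = ratio.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
# Scale 6, as in DECIMAL(9,6): six decimals survive.
six_places = ratio.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)
# Widening an already-rounded value only pads it with zeros.
widened = two_places.quantize(Decimal("0.000001"))

print(two_places)  # -0.66
print(six_places)  # -0.662962
print(widened)     # -0.660000
```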

How do you round floats conditionally?

I am writing a query that is used by report generating software.
Part of this is querying for the hours needed to complete a project. We record this as a 2-decimal float so that we can estimate to the quarter hour.
However, if we are using it in our report and the hour we recorded is something like 8.00, I want to query it and format it so that 8.00 is just 8. However any hours with something past the decimal, like 8.25, should remain as 8.25. How can I make this work?
hours Queried Result
====== -> My Query -> ==============
8.00 8
8.25 8.25
I am using MySQL 5.6
You can use the REPLACE() function to remove .00:
REPLACE(hours, '.00', '') AS hours
You can convert it to a string and check the rightmost 2 characters and trim those if they are '00'.
SELECT TRIM(TRAILING '.00' FROM CAST(column_name AS CHAR));
SELECT REPLACE(ROUND(8.00, 2), '.00', '');
Here are some more examples to make the logic clear:
MySQL ROUND() rounds the number given as its first argument to the number of decimal places given as its second argument.
Syntax:
ROUND(N[, D]);
Where 'N' is the number to be rounded
and 'D' is the number of decimal places N is rounded to. If D is omitted, it defaults to 0.
Example 1:-
SELECT ROUND(4.43);
Output :-
4
The above MySQL statement will round the given number 4.43. No decimal places have been defined, so the default decimal value is 0.
Example 2:-
SELECT ROUND(-4.53);
Output:-
-5
The above MySQL statement will round the given number -4.53. No decimal places have been defined, so the default decimal value is 0.
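The string-trimming idea from the answers above (drop a trailing ".00", leave everything else alone) can be sketched in Python as follows. The function name format_hours is hypothetical, and the sketch assumes values with at most two decimals, as in the question.

```python
from decimal import Decimal


def format_hours(hours) -> str:
    """Render hours with two decimals, dropping the fraction only when it is .00."""
    text = f"{Decimal(str(hours)):.2f}"
    return text[:-3] if text.endswith(".00") else text


print(format_hours(8.00))  # 8
print(format_hours(8.25))  # 8.25
print(format_hours(8.5))   # 8.50
```

Like REPLACE(hours, '.00', ''), this leaves a value such as 8.50 untouched; only whole hours lose their decimals.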

Column type for saving very different number of decimals

I need to store numbers like
21000
1.0002
0.00230235
12323235
0.2349523
This is sensor data, so it is important to keep the exact value.
There are many options.
My solution would be to multiply all values by 1 million, and store them as a bigint. Would that make sense?
That makes sense but I'd recommend that you just use the decimal datatype: https://dev.mysql.com/doc/refman/5.7/en/precision-math-decimal-characteristics.html
If you multiply by a million and a dataset you receive has one more decimal place than you expected, you would have to multiply that number by 10 million and all the other stored numbers by 10. The decimal datatype avoids this and gives you up to 30 digits to the right of the decimal point.
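A short Python sketch shows the problem with a fixed scale factor, using the sample readings from the question. The factor of one million and the truncation to an integer are assumptions about how the BIGINT approach would be implemented.

```python
from decimal import Decimal

readings = ["21000", "1.0002", "0.00230235", "12323235", "0.2349523"]
SCALE = 1_000_000  # assumed fixed scale factor for the BIGINT approach

lossy = []
for raw in readings:
    exact = Decimal(raw)
    scaled = int(exact * SCALE)         # integer that a BIGINT column would hold
    restored = Decimal(scaled) / SCALE  # value read back from the column
    if restored != exact:
        lossy.append(raw)

print(lossy)  # ['0.00230235', '0.2349523']
```

Both readings with more than six decimal places lose digits, which is exactly what exact sensor data cannot tolerate.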
The declaration syntax for a DECIMAL column is DECIMAL(M,D). The
ranges of values for the arguments in MySQL 5.7 are as follows:
M is the maximum number of digits (the precision). It has a range of 1
to 65.
D is the number of digits to the right of the decimal point (the
scale). It has a range of 0 to 30 and must be no larger than M.
and
The SQL standard requires that the precision of NUMERIC(M,D) be
exactly M digits. For DECIMAL(M,D), the standard requires a precision
of at least M digits but permits more. In MySQL, DECIMAL(M,D) and
NUMERIC(M,D) are the same, and both have a precision of exactly M
digits.
For a full explanation of the internal format of DECIMAL values, see
the file strings/decimal.c in a MySQL source distribution. The format
is explained (with an example) in the decimal2bin() function.
To format your numbers, you could do formatting like this answer describes: Format number to 2 decimal places
Example
create table test (
price decimal(40,20)
);
-- all of the following insertions will succeed cleanly
insert into test values (1.5), (1.66), (1.777), (1.12345678901234567890);
-- notice we have 21 digits after decimal
-- MySQL will insert data with 20 decimal and add a warning regarding data truncation
insert into test values (1.123456789012345678901);
Data
select * from test
price
1.50000000000000000000
1.66000000000000000000
1.77700000000000000000
1.12345678901234567890
1.12345678901234567890
select cast(price as decimal(40,2)) from test
price
1.50
1.66
1.78
1.12
1.12

Turn off scientific notation MySQL

When I insert a DOUBLE, why do I see a value like 9.755046187483832e17 when I select that value? How can I retrieve a number like 975504618748383289 instead?
You're probably looking for the FORMAT or ROUND function:
Using FORMAT(), depending on your locale and your specific needs, you might have to replace the thousands-separator:
mysql> SELECT FORMAT(9.755046187483832e17,0);
975,504,618,748,383,200
mysql> SELECT REPLACE(FORMAT(9.755046187483832e17,0), ',','');
975504618748383200
On the other hand, ROUND() being a numeric function, it only outputs digits:
mysql> SELECT ROUND(9.755046187483832e17,0);
975504618748383200
See http://sqlfiddle.com/#!2/d41d8/17614 for playing with that.
EDIT: As you noticed, the last two digits are rounded to 00. That is because of the precision limits of DOUBLE; remember that doubles are approximate. If you need precise values and/or more digits than the roughly 16 significant decimal digits a double provides, you probably need to change your column's type to DECIMAL. By default, DECIMAL has a precision of 10 decimal digits. You can explicitly request up to 65 digits.
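The same precision limit can be observed in any language that uses IEEE 754 doubles. The Python sketch below uses the value from the question: float is a 64-bit double with a 53-bit significand, and decimal.Decimal stands in for an exact DECIMAL column.

```python
from decimal import Decimal

n = 975504618748383289

as_double = float(n)   # nearest representable double
as_decimal = Decimal(n)  # exact

print(int(as_double) == n)      # False: the low digits do not survive
print(as_decimal == n)          # True
print(abs(int(as_double) - n))  # size of the rounding error
```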
For example, if you need up to 20 digits precision, you write something like that:
CREATE TABLE tbl (myValue DECIMAL(20), ...
See http://dev.mysql.com/doc/refman/5.6/en/fixed-point-types.html
Please note, however, that things are not that simple. Selecting the decimal column might silently convert it to double (or bigint?), thus losing the extra precision. You might have to explicitly cast to string in order to preserve the full precision. That means that you might have to deal with that at application level.
create table tbl (dblValue DOUBLE, decValue DECIMAL(20,0));
insert into tbl values (975504618748383289, 975504618748383289);
SELECT dblValue, decValue FROM tbl;
--> DBLVALUE DECVALUE
--> 975504618748383200 975504618748383200
SELECT CAST(dblValue AS CHAR), CAST(decValue AS CHAR) FROM tbl;
--> CAST(DBLVALUE AS CHAR) CAST(DECVALUE AS CHAR)
--> 9.755046187483832e17 975504618748383289
See http://sqlfiddle.com/#!2/d5f58/2 for examples.
The double has a precision of about 16 digits. If you need more precision you have two options.
If the value is an integer, you can use bigint up to about 19 digits of precision.
Better is decimal which supports up to 65 digits. For instance, the following returns an error:
select format(v*v*v*v*v*v*v*v, 0)
from (select 1001 as v) t
Because the value of v is treated as a bigint.
However, the following works very nicely:
select format(v*v*v*v*v*v*v*v, 0)
from (select cast(1001 as decimal(65, 0)) as v) t
Returning 1,008,028,056,070,056,028,008,001 -- which is the precise answer.
If you need precision up to 65 places, then use decimal. Beyond that, you may have to write your own routines. If you are not doing arithmetic on the field, then consider storing it as a string.
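The numbers in this answer can be verified with a few lines of Python, whose integers have arbitrary precision. BIGINT_MAX below is the largest signed 64-bit value; the variable names are hypothetical.

```python
from decimal import Decimal, getcontext

BIGINT_MAX = 2**63 - 1  # upper bound of a signed 64-bit integer
value = 1001**8         # exact, since Python integers do not overflow

print(value > BIGINT_MAX)  # True: the product does not fit in a BIGINT
print(f"{value:,}")        # 1,008,028,056,070,056,028,008,001

getcontext().prec = 65     # same number of digits as DECIMAL(65, 0)
print(Decimal(1001) ** 8 == value)  # True
```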

MySQL: limit results by a calculated step interval

I have a need to return a specific number of rows from a query within a given start and stop time at a dynamically calculated step interval.
I've kept it simple here with a table consisting of a unix timestamp and a corresponding integer value.
In my example, I need to have 200 rows returned with an INCLUSIVE start time of 1307455099 and an INCLUSIVE end time of 1307462455.
Here's the current query I've developed so far. It uses the modulus of the total rows to calculate the step interval:
SELECT timestamp, value FROM soh_data
WHERE timestamp % (CAST((1307462455 - 1307455099)/200 AS SIGNED INTEGER)) = 0
AND timestamp BETWEEN 1307455099 AND 1307462455
ORDER BY timestamp;
The first problem is that because I'm using a modulus, the start and end times aren't always inclusive (that's solvable with an extra query... I'm fine with that).
The second, and more difficult issue to tackle, is that the total rows returned in this case is only 196. In most queries, it's n-1.
FYI, this is on a MySQL database with millions of rows of data.
Any insights?
Since I'm fine with throwing away a few rows, but I'm not alright with too little data, I've come up with two different approaches.
First: I've decided to adapt my query to use FLOOR instead of CAST. In my example, the quotient of the division was 21.805. SQL rounded that up to 22. The right step interval for gathering more than 200 results was 21 (yielding 205 results). Using FLOOR will give me the step number of 21 I need. Unfortunately, I haven't fully tested this to ensure consistent results across larger sets:
SELECT DISTINCT timestamp FROM soh_data
WHERE timestamp % (FLOOR((1307459460 - 1307455099)/200)) = 0
AND timestamp BETWEEN 1307455099 AND 1307459460
ORDER BY timestamp;
The more reliable solution is to pre-calculate the step in code. This way, I can zero in on the step programmatically. In the following example, I use Ruby for readability, but my ultimate solution will be coded in C++:
lower = 1307455099
upper = 1307459460
limit = 200
range = lower..upper
matches = 0
stepFactor = ((upper-1) - (lower+1))/limit
while (matches <= (limit - 2)) do
  matches = 0
  range.each { |ts| matches += 1 if (ts % stepFactor == 0) }
  stepFactor -= 1 # For the next attempt
  puts "Step factor = #{stepFactor+1}"
  puts "Matches = #{matches}"
end
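As a variation on the loop above, the number of matching timestamps can be computed in closed form instead of iterating over the whole range. This Python sketch is not the author's method: like the Ruby loop, it assumes there is a row for every integer timestamp in the range, and both function names are hypothetical.

```python
def count_multiples(lower: int, upper: int, step: int) -> int:
    """Number of integers ts in [lower, upper] with ts % step == 0."""
    return upper // step - (lower - 1) // step


def largest_step_with_at_least(lower: int, upper: int, limit: int) -> int:
    """Start from the floored step and shrink it until at least `limit` values match."""
    step = max((upper - lower) // limit, 1)
    while step > 1 and count_multiples(lower, upper, step) < limit:
        step -= 1
    return step


lower, upper, limit = 1307455099, 1307459460, 200
step = largest_step_with_at_least(lower, upper, limit)
print(step)  # 21
print(count_multiples(lower, upper, step) >= limit)  # True
```

With real data that has gaps, the count has to come from the table itself, so the closed form only gives an upper bound on the number of rows returned.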
The number of rows returned would depend entirely on how many timestamps match your condition, of course. Let's say your step value comes out to 2, so your modulo math boils down to 'only even numbered timestamps'. If by chance all items in your table have odd time stamps, then you're going to get 0 rows returned, even though there's (say) 500+ items within the time range.
If you need exactly 200, you'd probably be better off using LIMIT in some way.