SQL sum round numbers and count decimal - mysql

How can I sum integers as they are and treat non-integer (decimal) numbers as 1?
From the table given below the expected result is:
1 + 1 + 1 + 5 = 8
colum1 colum2
aa 1
bb 0.5
cc 3.66
dd 5

You can compare each number to its floored value to check if it is a decimal or not, and then use a case expression to treat decimals as 1:
SELECT CAST(SUM(CASE number WHEN FLOOR(number) THEN number ELSE 1 END) AS SIGNED)
FROM mytable;
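The same rule can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function name and the sample list are illustrative assumptions. It applies the floor comparison value by value to the numbers from the question.

```python
import math


def sum_integers_count_decimals(values):
    """Sum whole numbers as they are and count every non-whole number as 1."""
    total = 0
    for v in values:
        # A value equals its floor only when it has no fractional part.
        total += int(v) if v == math.floor(v) else 1
    return total


# Values from the second column of the question's table.
sample = [1, 0.5, 3.66, 5]
print(sum_integers_count_decimals(sample))  # 1 + 1 + 1 + 5 = 8
```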

Related

Sql query to count values after a particular condition is met

I have a table,
Name Seconds Status_measure
a 0 10
a 10 13
a 20 -1
a 30 15
a 40 20
a 50 12
a 60 -1
For a particular name, I want a new column calculated as "the number of times the value is greater than -1 after the first -1 has been reached". In this data, the new column for the name "a" should have the value 3, because after the first -1 in Status_measure there are 3 values (15, 20 and 12) greater than -1.
Required data frame:
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
I tried doing
count(status_measure>-1) over (partition by name order by seconds)
But this does not give the desired result.
You can do it in two steps: first assign the rows to groups, then count the matching rows in the group where grp = 1.
select *, sum(Status_measure > -1 and grp = 1) over(partition by name) n
from (
select *
, row_number() over(partition by name order by Seconds) - sum(Status_measure > -1 ) over(partition by name order by Seconds) grp
from tbl
) t
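To see why the group number works, here is a small Python sketch of the same arithmetic. It is only an illustration under the assumption that the rows are already ordered by Seconds for a single name; the function name is hypothetical.

```python
def count_after_first_marker(statuses):
    """Mimic the two-step query: label groups, then count rows > -1 in group 1."""
    running_positive = 0  # running count of rows with status > -1
    count = 0
    for row_number, status in enumerate(statuses, start=1):
        if status > -1:
            running_positive += 1
        # grp grows by one at every -1 row, so grp == 1 covers the first -1
        # and the rows that follow it up to the next -1.
        grp = row_number - running_positive
        if status > -1 and grp == 1:
            count += 1
    return count


statuses = [10, 13, -1, 15, 20, 12, -1]
print(count_after_first_marker(statuses))  # 3 (the values 15, 20 and 12)
```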
An option is to use a user variable, which:
starts from 0
increases its value when it reaches a -1
decreases its value when it reaches a second -1
Once you have this column, you can run a sum over its values.
SET @change = 0;
SELECT *, SUM(CASE WHEN Status_measure = -1
THEN IF(@change=0, @change := @change + 1, @change := @change - 1)
ELSE @change END) OVER() -1 AS Value_
FROM tab
Limitations: this solution assumes you have only one range of interesting values between -1s.
Note: there's a -1 decrement from your sum because the first update of the variable will leave 1 in the same row of -1, which you don't want. For better understanding, comment out the application of SUM() OVER and see intermediate output.
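The toggle logic can be traced by hand with a short Python sketch. This is an illustration of the idea rather than of how MySQL evaluates user variables; the function name is an assumption, and the rows are assumed to be processed in order of Seconds.

```python
def count_between_markers(statuses):
    """Toggle a flag at each -1 and sum the flag over all rows."""
    change = 0
    total = 0
    for status in statuses:
        if status == -1:
            # The first -1 switches the flag on, the second switches it off.
            change = 1 if change == 0 else 0
        total += change
    # The row holding the first -1 already carries a 1, so subtract it.
    return total - 1


statuses = [10, 13, -1, 15, 20, 12, -1]
print(count_between_markers(statuses))  # 3
```

Like the query, this sketch only handles a single range of values between two -1 rows.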
More of a clarification to your question first. I want to expand your original data to include another row for the sake of 2 vs 3 entries. Also, is there some auto-increment ID in your data that the sequential consideration is applicable such as
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
If the rows are sequential, IDs 1 and 2 come before the -1 at ID 3, which indicates two entries. IDs 4-6 then give a count of three entries before the -1 at ID 7.
So, which "VALUE" do you want in your result? A value of 2 for IDs 1, 2 and 3 and a value of 3 for IDs 4-7? Or do you want ALL entries to show the greatest count found before a -1 measure, which is 3?
Please EDIT your question, you can copy/paste this in your original question if need be and provide additional clarification as requested (auto-increment as well as that is an impact of final output / determining break).

Get the average of values in every specific epoch ranges in unix timestamp which returns -1 in specific condition in MySQL

I have a MySQL table which has some records as follows:
unix_timestamp value
1001 2
1003 3
1012 1
1025 5
1040 0
1101 3
1105 4
1130 0
...
I want to compute the average for every 10 epochs to see the following results:
unix_timestamp_range avg_value
1001-1010 2.5
1011-1020 1
1021-1030 5
1031-1040 0
1041-1050 -1
1051-1060 -1
1061-1070 -1
1071-1080 -1
1081-1090 -1
1091-1100 -1
1101-1110 3.5
1111-1120 -1
1121-1130 0
...
I saw some similar answers, but they are not a solution for my specific question. How can I get the above results?
The easiest way to do this is to use a calendar table. Consider this approach:
SELECT
CONCAT(CAST(cal.ts AS CHAR(50)), '-', CAST(cal.ts + 9 AS CHAR(50))) AS unix_timestamp_range,
CASE WHEN COUNT(t.value) > 0 THEN AVG(t.value) ELSE -1 END AS avg_value
FROM
(
SELECT 1001 AS ts UNION ALL
SELECT 1011 UNION ALL
SELECT 1021 UNION ALL
...
) cal
LEFT JOIN yourTable t
ON t.unix_timestamp BETWEEN cal.ts AND cal.ts + 9
GROUP BY
cal.ts
ORDER BY
cal.ts;
In practice, if you have the need to do this sort of query often, instead of the inline subquery labelled as cal above, you might want to have a full dedicated table representing all timestamp ranges.
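As a cross-check of the expected output, the bucketing rule can be sketched in Python. This is not the calendar-table query itself; the function name, the `start`/`end`/`width` parameters and the in-memory row list are assumptions made for the illustration.

```python
def bucket_averages(rows, start, end, width=10):
    """Average values per fixed-width timestamp range; empty ranges yield -1."""
    sums = {}
    counts = {}
    for ts, value in rows:
        if start <= ts <= end:
            # Lower bound of the range this timestamp falls into.
            lo = start + ((ts - start) // width) * width
            sums[lo] = sums.get(lo, 0) + value
            counts[lo] = counts.get(lo, 0) + 1
    result = []
    for lo in range(start, end + 1, width):
        label = f"{lo}-{lo + width - 1}"
        avg = sums[lo] / counts[lo] if lo in counts else -1
        result.append((label, avg))
    return result


rows = [(1001, 2), (1003, 3), (1012, 1), (1025, 5),
        (1040, 0), (1101, 3), (1105, 4), (1130, 0)]
for label, avg in bucket_averages(rows, 1001, 1130):
    print(label, avg)
```

Generating the range list in a loop plays the role of the `cal` subquery: every range appears in the output even when no row falls into it.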

Why does count(distinct ..) return different values on the same table?

select count(distinct a,b,c,d) from mytable;
select count(distinct concat(a,'-',b),concat(c,'-',d)) from mytable;
Since '-' never appears in the a, b, c, d fields, the 2 queries above should return the same result. Am I right?
Actually, that is not the case: the difference is 4 rows out of ~60M, and I can't figure out how this is possible.
Any idea or example ?
Thanks
First, I am assuming that you are using MySQL, because that is the only database of your original tags where your syntax would be accepted.
Second, this does not directly answer your question. Given your types and expressions, I do not see how you can get different results. However, very similar constructs can produce different results.
It is very important to note that NULL is not the culprit. If any argument is NULL, COUNT(DISTINCT) does not count that row, and CONCAT() returns NULL, which is not counted either.
However, spaces at the end of strings can be an issue. Consider the results from this query:
select count(distinct x, y),
count(distinct concat(x, '-', y)),
count(distinct concat(y, '-', x))
from (select 1 as x, 'a' as y union all
select 1, 'a ' union all
select 1, NULL
) a
I would expect the second and third expressions to return the same thing. But spaces at the end of the string cause differences. COUNT(DISTINCT) ignores them. However, CONCAT() will embed them in the string. Hence, the above returns
1 1 2
And the two values are different.
In other words, two values may not be exactly the same, but COUNT(DISTINCT) might regard them as the same. Spaces are one example. Collations are another potential culprit.
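The effect can be imitated in Python. This is a rough emulation, not MySQL's implementation: it assumes a comparison that ignores trailing spaces (modelled here with `rstrip`) and treats `None` as NULL. The helper names `concat` and `count_distinct` are made up for the sketch.

```python
def concat(*parts):
    """Return None if any part is None, like CONCAT() with a NULL argument."""
    if any(p is None for p in parts):
        return None
    return "".join(str(p) for p in parts)


def count_distinct(rows, *exprs):
    """Count distinct non-NULL combinations, ignoring trailing spaces in strings."""
    seen = set()
    for row in rows:
        values = [expr(row) for expr in exprs]
        if any(v is None for v in values):
            continue  # rows with a NULL are not counted
        seen.add(tuple(v.rstrip(" ") if isinstance(v, str) else v for v in values))
    return len(seen)


rows = [(1, "a"), (1, "a "), (1, None)]
print(
    count_distinct(rows, lambda r: r[0], lambda r: r[1]),
    count_distinct(rows, lambda r: concat(r[0], "-", r[1])),
    count_distinct(rows, lambda r: concat(r[1], "-", r[0])),
)  # 1 1 2
```

In the third expression the space ends up in the middle of the string ('a -1'), so it is no longer a trailing space and the two values stay distinct.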
Take the following sample data as an example:
A B C D
1 2 3 4
5 6 7 8
1 2 5 7
1 2 5 7
1 3 3 4
1 3 3 4
then count (distinct (a, b, c, d)) = 4
A B C D
1 2 3 4
5 6 7 8
1 2 5 7
1 3 3 4
and count (distinct (a,-,b), distinct (c,-,d)) = 3
dist (a,-,b) dist (c,-,d)
1 2 3 4
5 6 7 8
1 3 5 7

Custom number sequence formatting

The system I am working with has a numbering system where the numbers 0-999 are represented by the usual 0-999, but 1000 is represented by A00, followed by A01, A02, A03, etc, 1100 being B00 etc.
I can't think of a way to handle this in T-SQL without resorting to inspecting individual digits with huge case statements, and there must be a better way than that. I had thought about using Hexadecimal but that's not right.
DECLARE @startint int = 1,
@endint int = 9999;
;WITH numbers(num)
AS
(
SELECT @startint AS num
UNION ALL SELECT num+1 FROM numbers
WHERE num+1 <= @endint
)
SELECT num, convert(varbinary(8), num) FROM [numbers] N
OPTION(MAXRECURSION 0)
With this 999 is now 3E7, where it should just be 999.
This currently produces this:
Number Sequence
0 0x00000000
1 0x00000001
...
10 0x0000000A
...
100 0x00000064
...
999 0x000003E7
1000 0x000003E8
What I'm looking for:
Number Sequence
0 000
1 001
...
10 010
11 011
12 012
...
999 999
1000 A00
1001 A01
...
1099 A99
1100 B00
1101 B01
1200 C00
I need this to work in SQL Server 2008.
You can use integer division and modulo to separate the hundreds part from the tens.
After that, you can add 64 to the quotient to get an ASCII value starting from A.
create function dbo.fn_NumToThreeLetters(@num integer)
RETURNS nchar(3)
AS
begin
RETURN (SELECT (case
when @num/1000 > 0 then
-- 1000-1099 -> A, 1100-1199 -> B, ...
CHAR(((@num-900)/100) + 64)
-- left-pad the remainder to two digits
+ right('0' + cast(@num % 100 as varchar(2)), 2)
-- left-pad numbers below 1000 to three digits
else right('000' + cast(@num as varchar(3)), 3)
end)
)
END
select dbo.fn_NumToThreeLetters(1100)
-------
B00
select dbo.fn_NumToThreeLetters(999)
-------
999
The first when clause ensures that the conversion is applied only if a number is 1000 or above. If it is, subtract 900 then divide by 100, so we get a number that starts from 1 for 1000, 2 for 1100, etc.
Add 64 to it to get an ASCII code starting from A and convert it back to a character with CHAR.
The remainder just needs to be converted to a 2-digit string, left-padded with 0 (casting to nchar(2) and replacing spaces would pad on the right and turn 5 into 50). Numbers below 1000 are left-padded to three digits in the same way, so 1 becomes 001.
This will work only up to 3599 (Z99). The question doesn't specify what should be done with larger numbers.
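The mapping is easy to verify against the question's expected sequence with a short Python sketch. The function name is hypothetical, the zero-padding follows the question's sample output, and rejecting values outside 0-3599 is an assumption, since the question does not say what should happen beyond the letter Z.

```python
def num_to_three_chars(num):
    """Format 0-999 as three digits and 1000-3599 as a letter plus two digits."""
    if not 0 <= num <= 3599:
        raise ValueError("only 0..3599 can be represented with A-Z")
    if num < 1000:
        return f"{num:03d}"
    # 1000-1099 -> 1 -> 'A', 1100-1199 -> 2 -> 'B', and so on.
    letter = chr((num - 900) // 100 + 64)
    return f"{letter}{num % 100:02d}"


for n in (0, 10, 999, 1000, 1001, 1099, 1100, 1200):
    print(n, num_to_three_chars(n))
```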

Average of rows Mysql ignore zero values

Hello, I have a problem with the AVG function. I have a table like this, and I would like to take the average of each row. Some cells contain zero, and I would like to exclude them from the count.
data rep val1 val2 val3
1 a 0 3 3
2 a 1 4 0
3 a 1 1 1
4 a 1 3 0
And I would like this result
data AVG
1 3
2 2.5
3 1
4 2
thank you
Assuming you have at least one non-zero value:
SELECT data, (val1+val2+val3)/((val1!=0) + (val2!=0) + (val3!=0)) avg
FROM table_name
In MySQL, division by zero returns NULL (depending on your SQL mode settings), so you could do:
SELECT data, COALESCE((val1+val2+val3)/((val1!=0) + (val2!=0) + (val3!=0)),0) avg
FROM table_name
If any value in a row is NULL, the first query returns NULL for that row and the second query returns 0.
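For comparison, the same per-row calculation can be written in Python. This sketch is illustrative only: the function name is an assumption, `None` stands in for NULL, and returning `None` for an all-zero row mirrors the first query, where the division by zero produces NULL.

```python
def row_average_ignoring_zeros(values):
    """Average the non-zero values of a row; None if undefined."""
    if any(v is None for v in values):
        return None  # a NULL anywhere makes the whole expression NULL
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        return None  # all zeros: the divisor would be 0
    return sum(nonzero) / len(nonzero)


table = {1: (0, 3, 3), 2: (1, 4, 0), 3: (1, 1, 1), 4: (1, 3, 0)}
for data, vals in table.items():
    print(data, row_average_ignoring_zeros(vals))
```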