SQL Fastest way to Group records based on Thresholds - sql-server-2008

My records are in a temporary table having three columns :
Column1 : ID (Bigint)
Column2 : CreationDateTime (dateTime)
Column3 : Volume (Float)
The records are sorted based on CreationDateTime.
I need to pick the records from the table where the running sum of Volume reaches THRESHOLD1, and then do the same for THRESHOLD2.
One way is to add a new column to the table that holds the sum of Volume up to and including each record. For example:
ID - CreationDateTime - Volume - SUM
1 - 20/07/2012 - 10 - 10
2 - 21/07/2012 - 12 - 22
3 - 22/07/2012 - 7 - 29
and then run Select * from temp where Sum >= Threshold. But calculating the sum this way is not the fastest approach.
I was wondering if anyone can suggest a better way for doing the above.
I'm using SQL server 2008 and I can also use CLR if required.

Try this solution. You can compute the running total by self-joining the table and grouping:
with cte as(
-- for each row T2, sum the Volume of every row T1 whose ID is at or before T2's
select T2.ID, T2.CreationDateTime,SUM(T1.Volume) [SUM]
from test_table T1 join test_table T2
on T1.ID<=T2.ID
group by T2.ID, T2.CreationDateTime)
select * from cte where [SUM]>= @Threshold
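To make the running-total idea concrete outside of SQL, here is a minimal Python sketch. It uses the three sample rows from the question; the helper name `rows_reaching` and the threshold of 20 are assumptions made for illustration only, not part of the original answer.

```python
from itertools import accumulate

# Sample rows from the question: (ID, CreationDateTime, Volume), sorted by date.
rows = [
    (1, "20/07/2012", 10.0),
    (2, "21/07/2012", 12.0),
    (3, "22/07/2012", 7.0),
]

def rows_reaching(rows, threshold):
    """Return (ID, running_total) for rows whose running total is >= threshold."""
    # accumulate() yields the running sum of Volume in row order.
    totals = accumulate(volume for _, _, volume in rows)
    return [(row[0], total) for row, total in zip(rows, totals) if total >= threshold]

# 20 is a hypothetical threshold chosen for the example.
result = rows_reaching(rows, 20)
print(result)
```

A single pass like this is linear in the number of rows, whereas the self-join compares every row with every earlier row.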

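Here's an approach using a recursive CTE, which will likely be the fastest. It assumes the IDs are consecutive with no gaps:
declare @i bigint
select @i=min(ID) from #temp
;with a as
(
-- anchor: the first row starts the running total
select ID, Volume, Volume as RunningTotal
from #temp
where ID=@i
union all
-- recursive step: add the next ID's Volume to the total so far
select b.ID, b.Volume, b.Volume + a.RunningTotal as RunningTotal
from #temp b
inner join a
on b.ID=a.ID+1
)
select * from a
option (maxrecursion 0) -- lift the default limit of 100 recursion levels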
Some links related to running totals:
http://www.sqlusa.com/bestpractices/runningtotal/
http://www.databasejournal.com/features/mssql/article.php/3112381/SQL-Server-Calculating-Running-Totals-Subtotals-and-Grand-Total-Without-a-Cursor.htm
http://www.mssqltips.com/sqlservertip/1686/calculate-running-totals-using-sql-server-cross-joins/
http://social.msdn.microsoft.com/Forums/eu/transactsql/thread/1b4d87cb-ec77-4455-af48-bf7dae50ab87
Computed Column using a function:
create function dbo.fn_VolumeRunningTotal
(
@dt datetime
)
returns float -- Volume is a float column
as
begin
declare @total float
select @total = sum(Volume)
from dbo.MyVolumeTable
where CreationDateTime <= @dt
return @total
end
Computed Column formula:
dbo.fn_VolumeRunningTotal(CreationDateTime)
Select statements:
select * from dbo.MyVolumeTable where RunningTotal <= @Threshold1

Related

How to fix Number Of Row Per Page in SSRS Report?

I have a fixed 10 rows per page. If only 2 records come from the query, then I want to show 8 blank rows. How can I do this?
This is a fairly generic example. It just counts the actual rows, calculates how many rows are required to round up to the nearest 10 and then UNIONs a query that generates blank rows.
SELECT *
FROM (
SELECT
RowN = ROW_NUMBER() OVER(ORDER BY myColumn), -- order by any column
*
FROM myTable
UNION ALL
SELECT
TOP (SELECT ExtraRows = (FLOOR((Count(*)+9)/10) * 10) - COUNT(*) FROM myTable) -- 10 here is rows per page
NewRowNumber = ROW_NUMBER() OVER (ORDER BY [object_id]) + (SELECT COUNT(*) FROM myTable)
,NULL, NULL, NULL -- one null value for each column in myTable
FROM sys.all_columns
) u
ORDER by u.RowN -- add any additional required sorting here
If your current query is not simple then dump the results of that into a temp table
SELECT *
INTO #t
FROM ...
myBigQuery
then change the references to myTable in the main query above to #t or whatever the temp table is called.
EDIT for using with SP
If you are using a stored procedure, you can dump its results into a temp table and do the same. For example:
CREATE TABLE #t (ColA int, ColB varchar(100)....)
INSERT INTO #t
EXEC myStoredProc
...
the main query from above
...
Just swap out all references to myTable with #t
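The padding arithmetic in the query above can be checked in isolation. The following Python sketch is only an illustration; the function names are hypothetical. It shows the same "round up to the next multiple of 10" calculation written two ways.

```python
def blank_rows_needed(row_count, rows_per_page=10):
    """Blank rows required to round row_count up to a full page."""
    # Python's % always returns a non-negative result for a positive modulus.
    return (-row_count) % rows_per_page

def blank_rows_needed_sql_style(row_count, rows_per_page=10):
    """Same calculation, written the way the SQL expression does it."""
    # Integer division mirrors SQL Server's int / int behaviour.
    pages = (row_count + rows_per_page - 1) // rows_per_page
    return pages * rows_per_page - row_count

print(blank_rows_needed(2))
```

With 2 real rows this gives 8 blank rows; with exactly 10 rows it gives 0, so a full page is not padded with an extra empty one.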

Calculate average per day based on the difference of the values

I have a table:
value  updated_at          ID
5      2022-1-1 12:00:00   1
10     2022-1-1 12:00:30   2
20     2022-1-1 12:02:30   3
What I want to do is to get an average based on the updated_at column difference, and the values of course.
So, I guess the formula should be:
(sumof((value2 - value1) * (date2 - date1))) / (dateLast - dateFirst), where 1 and 2 denote each pair of consecutive rows as we traverse from the first to the last item. E.g. for this table we'll have:
First and second row: (value2 - value1) * (date2 - date1) = (10 - 5) * (30 (seconds)) = 150
for second and third row: (20 - 10) * 120 = 1200
So the result is:
(1200 + 150) / (2022-1-1 12:02:30 - 2022-1-1 12:00:00) = 9
I probably can get this working with a self JOIN on ID and ID + 1 and I also can do the diff of last and first date, but I can't do them both in the same query! I have no idea how to do that, is this even possible to be done in a single query?
Update
My MySql version is 5.6
For MySql 8.0+ you can use LAG() window function to get each row's previous values and then aggregate:
WITH cte AS (
SELECT *,
value - LAG(value) OVER (ORDER BY updated_at) dif_value,
UNIX_TIMESTAMP(updated_at) - UNIX_TIMESTAMP(LAG(updated_at) OVER (ORDER BY updated_at)) dif_time
FROM tablename
)
SELECT SUM(dif_value * dif_time) /
(UNIX_TIMESTAMP(MAX(updated_at)) - UNIX_TIMESTAMP(MIN(updated_at))) result
FROM cte;
For previous versions and if there are no gaps between the ids, use a self join:
SELECT SUM(dif_value * dif_time) /
(UNIX_TIMESTAMP(MAX(updated_at)) - UNIX_TIMESTAMP(MIN(updated_at))) result
FROM (
SELECT t1.*,
t1.value - t2.value dif_value,
UNIX_TIMESTAMP(t1.updated_at) - UNIX_TIMESTAMP(t2.updated_at) dif_time
FROM tablename t1 LEFT JOIN tablename t2
ON t1.ID = t2.ID + 1
) t;
See the demo.
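As a quick check of the formula from the question, here is a minimal Python sketch that walks consecutive pairs of rows in the same way LAG() or the self join does. The names `samples` and `weighted_change` are assumptions for this example and do not come from either SQL answer.

```python
from datetime import datetime

# The three sample rows from the question: (value, updated_at).
samples = [
    (5, datetime(2022, 1, 1, 12, 0, 0)),
    (10, datetime(2022, 1, 1, 12, 0, 30)),
    (20, datetime(2022, 1, 1, 12, 2, 30)),
]

def weighted_change(samples):
    """sum((value2 - value1) * (date2 - date1)) / (dateLast - dateFirst), in seconds."""
    weighted_sum = 0.0
    # zip(samples, samples[1:]) pairs each row with the one that follows it.
    for (v1, t1), (v2, t2) in zip(samples, samples[1:]):
        weighted_sum += (v2 - v1) * (t2 - t1).total_seconds()
    span = (samples[-1][1] - samples[0][1]).total_seconds()
    return weighted_sum / span

result = weighted_change(samples)
print(result)
```

For the sample data the pairs contribute 150 and 1200, and the total span is 150 seconds, which matches the value of 9 worked out in the question.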

Passing multiple values for a single parameter from SSRS to hive

Creating a concatenated string in SSRS with values enclosed in single quotes
Are there any answers to the above question? I am stuck with the same problem.
The query from SSRS side is:
select *
from xyz.test_table1
where f1 in (?)
The data source in this case is a Hive table. The user selection is a multivalued parameter, which I expect to be substituted as:
where f1 in ('value1','value2')
when the query is executed. But looking at the query execution on the Hive side, it arrives as:
where f1 in ('value1,value2')
How could I solve this?
From the documentation here, it seems Hive Query Language supports Common Table Expressions.
Consequently, something similar to the following should work:
declare @str nvarchar(4000) = ?; -- String to split.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = ',')
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(',',isnull(@str,''),s),0)-s,4000) from s)
-- Return each individual value in the delimited string along with its position.
,v as (select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
)
select *
from v
join xyz.test_table1 as t
on v.item = t.f1
If you rather understandably don't want this rigamarole in all of your datasets, you would need to encapsulate this logic into whatever the Hive equivalent of a SQL Server table-valued function is, perhaps a UDTF?
In SQL Server, the function would be defined as follows:
create function [dbo].[fn_StringSplit4k]
(
@str nvarchar(4000) = ' ' -- String to split.
,@delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,@num as int = null -- Which value to return.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where rn = @num
or @num is null;
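The tally-table splitter can be hard to follow in one read, so here is a rough Python sketch of the same steps: find the 1-based start of every value, then cut from each start to the next delimiter. The function name `split_with_positions` is an assumption for illustration, and the sketch mirrors the logic only, not SQL Server's behaviour in every edge case.

```python
def split_with_positions(text, delimiter=","):
    """Return (rn, item) pairs, like the rn/item columns of the SQL function."""
    # Step "s": a value starts at position 1 and right after every delimiter.
    starts = [1] + [pos + 1 for pos, ch in enumerate(text, start=1) if ch == delimiter]
    items = []
    for rn, start in enumerate(starts, start=1):
        # Step "l": the value runs up to the next delimiter, or to the end of the string.
        end = text.find(delimiter, start - 1)  # str.find is 0-based
        if end == -1:
            end = len(text)
        items.append((rn, text[start - 1:end]))
    return items

result = split_with_positions("value1,value2,value3")
print(result)
```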
Figured it out! Posting the answer for other users.
Provide the query(under Query in SSRS) as an expression like below:
="select * from xyz.test_table1 where f1 in ('"&Join(Parameters!param.Value,"','")&"')"
The above string manipulation translates to:
select * from xyz.test_table1 where f1 in ('value1','value2')
Note: value1, value2 here are the values from user selected multivalue parameter
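For readers less familiar with the SSRS Join function, the same string manipulation looks roughly like this in Python. The function name `build_in_clause_query` is hypothetical, and the sketch performs no escaping, so it assumes the selected values contain no single quotes.

```python
def build_in_clause_query(values):
    """Mimic ="... in ('" & Join(Parameters!param.Value, "','") & "')" from SSRS."""
    # Joining on ',' (quote, comma, quote) closes one literal and opens the next.
    quoted = "','".join(values)
    return "select * from xyz.test_table1 where f1 in ('" + quoted + "')"

query = build_in_clause_query(["value1", "value2"])
print(query)
```

Because the values are pasted straight into the query text, a value containing a single quote would break the statement, so this pattern is best kept to parameters with a controlled list of values.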

repeat result multiple times in mysql

I have a table with id and no fields. What I want is for each row to be repeated no times in the result: if the no field is 2, then that row must appear twice in the result.
This is my sample table structure:
id no
1 3
2 2
3 1
now I need to get a result like:
1 3
1 3
1 3
2 2
2 2
3 1
I tried to write mysql query to get the result like above, but failed.
You need a table of numbers to accomplish this. For just three values, this is easy:
select t.id, t.no
from t join
(select 1 as n union all select 2 union all select 3
) n
on n.n <= t.no;
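The numbers-table join can be pictured with a short Python sketch that pairs each row with every number n up to its no value. The variable names here are assumptions chosen for the example, and the data is the sample table from the question.

```python
# Sample table from the question: (id, no).
rows = [(1, 3), (2, 2), (3, 1)]

# The "numbers table": 1 .. max(no), like the inline union all of 1, 2, 3.
numbers = range(1, max(no for _, no in rows) + 1)

# Join condition n <= no: each row matches exactly `no` numbers, so it repeats `no` times.
result = [(row_id, no) for row_id, no in rows for n in numbers if n <= no]
print(result)
```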
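This query should do what you want, provided the ids run consecutively from 1 up to at least the largest no, so that the table can act as its own numbers table:
select t.id, t.no from test t cross join test y where y.id<=t.no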
This does not completely solve your problem, but it can help:
set @i=0;
select
test_table.*
from
test_table
join
(select
@i:=@i+1 as i
from
any_table_with_number_of_rows_greater_than_max_no_of_test_table
where
@i < (select max(no) from test_table)) tmp on no >= i
order by
id desc
EDIT:
This is on SQL Server. I checked online and see that CTEs work on MySQL too (from version 8.0); I just couldn't get them to work on SQLFiddle.
Try this, and remove the unwanted columns:
create table #temp (id int, no int)
insert into #temp values (1, 2),(2, 3),(3, 5)
select * from #temp
;with cte as
(
select id, no, no-1 nom from #temp
union all
select c.id, c.no, c.nom-1 from cte c inner join #temp t on t.id = c.id and c.nom < t.no and c.nom > 0
)
select * from cte order by 1
drop table #temp

select min value of range [0,44) not in a column

I have a table with an int valued column, which has values between 0 and 43 (both included).
I would like a query that returns the min value of the range [0,44) which is not in the table.
For example:
if the table contains: 3,5, 14. The query should return 0
if the table contains: 0,1, 14. The query should return 2
if the table contains: 0,3, 14. The query should return 1
If the table contains all values, the query should return empty.
How can I achieve that?
Since the value you want is either 0 or 1 greater than a value that exists in the table, you can just do;
SELECT MIN(value)
FROM (SELECT 0 value UNION SELECT value+1 FROM MyTable) a
WHERE value < 44 AND value NOT IN (SELECT value FROM MyTable)
An SQLfiddle to test with.
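The reasoning behind that query (the answer is either 0 or one more than a value already present) can be sketched in Python as follows. The function name `min_missing` is an assumption for illustration; the test values are the examples from the question.

```python
def min_missing(values, upper=44):
    """Smallest integer in [0, upper) not present in values, or None if all are present."""
    present = set(values)
    # Candidates mirror: SELECT 0 UNION SELECT value+1 FROM MyTable
    candidates = {0} | {v + 1 for v in present}
    # Filter mirrors: WHERE value < 44 AND value NOT IN (SELECT value FROM MyTable)
    free = [c for c in candidates if c < upper and c not in present]
    return min(free) if free else None

print(min_missing([3, 5, 14]), min_missing([0, 1, 14]), min_missing([0, 3, 14]))
```

When every value from 0 to 43 is present, no candidate survives the filter, which corresponds to the SQL query returning NULL.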
One way would be to create another table that contains the integers in [0,43] and then left join that and look for NULLs, the NULLs will tell you what values are missing.
Suppose you have:
create table numbers (n int not null);
and this table contains the integers from 0 to 43 (inclusive). If your table is t and has a column n which holds the numbers of interest, then:
select n.n
from numbers n left join t on n.n = t.n
where t.n is null
order by n.n
limit 1
should give you the result you're after.
This is a fairly common SQL technique when you're working with a sequence. The most common use is probably calendar tables.
One approach is to generate a set of 44 rows with integer values, perform an anti-join against the distinct set of values from the table, and then grab the minimum value.
SELECT MIN(r.val) AS min_val
FROM ( SELECT 0 AS val UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
-- ...
SELECT 43
) r
LEFT
JOIN ( SELECT t.int_valued_col
FROM mytable t
WHERE t.int_valued_col >= 0
AND t.int_valued_col <= 43
GROUP BY t.int_valued_col
) v
ON v.int_valued_col = r.val
WHERE v.int_valued_col IS NULL
A little bit hacky and MySQL-specific:
SELECT NULLIF(MAX(IF(val=@min, @min:=(val+1), @min)), @max) as min_empty
FROM (
SELECT DISTINCT val
FROM table1
-- WHERE val BETWEEN 0 AND 43
ORDER BY val) as vals, (SELECT @min:=0, @max:=44) as init;