How can I get the median value for each group of values in a column from a SELECT statement? I guess I should use ROW_NUMBER, but I'm not sure how to go about it.
Note: if a group has an even number of rows, just take either of the two middle values.
name size
joe 10
joe 11
joe 19
joe 20
joe 47
sally 3
sally 8
sally 57
john 1
john 3
I want to get Joe 19, Sally 8, John 3
Set up testing data:
CREATE TABLE #Foo (Name varchar(10), value int)
INSERT #Foo values
('joe', 10)
,('joe', 11)
,('joe', 19)
,('joe', 20)
,('joe', 47)
,('sally', 3)
,('sally', 8)
,('sally', 57)
,('john', 1)
,('john', 3)
First pass, get proper ordering:
SELECT Name, Value, row_number() over (partition by name order by value) ranking
from #Foo
Second pass, identify the median item (with an even number of items, the lower of the two middle items is always returned, because AVG over the integer rankings truncates):
;WITH cteRankings (Name, Value, Ranking)
as (select Name, Value, row_number() over (partition by name order by value) ranking
from #Foo)
SELECT Name, avg(Ranking) MedianItem
from cteRankings
group by Name
Final pass, get the details for the median item:
;WITH cteRankings (Name, Value, Ranking)
as (select Name, Value, row_number() over (partition by name order by value) Ranking
from #Foo)
SELECT cte.Name, cte.Value
from cteRankings cte
inner join (-- Median items
select Name, avg(Ranking) MedianItem
from cteRankings
group by Name) xx
on xx.Name = cte.Name
and xx.MedianItem = cte.Ranking
order by cte.Name
Done as a CTE (common table expression), because the ranking subquery has to be referenced twice.
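The same idea can be tried outside SQL Server. The sketch below is an assumption-laden variant, not the answer's exact query: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for window functions), an illustrative table named foo, and picks the middle row with (count + 1) / 2 instead of AVG, because SQLite's AVG returns a float rather than a truncated integer.

```python
import sqlite3

# Sample rows from the question; table and column names are illustrative.
rows = [("joe", 10), ("joe", 11), ("joe", 19), ("joe", 20), ("joe", 47),
        ("sally", 3), ("sally", 8), ("sally", 57),
        ("john", 1), ("john", 3)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (name TEXT, value INTEGER)")
conn.executemany("INSERT INTO foo VALUES (?, ?)", rows)

# Rank each value within its name, then keep the row at position (count + 1) / 2.
# Integer division selects the lower of the two middle rows for even-sized groups.
query = """
WITH ranked AS (
    SELECT name, value,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY value) AS ranking,
           COUNT(*) OVER (PARTITION BY name) AS cnt
    FROM foo
)
SELECT name, value
FROM ranked
WHERE ranking = (cnt + 1) / 2
ORDER BY name
"""
medians = conn.execute(query).fetchall()
print(medians)
```

For john, who has two rows, this returns the lower middle value (1), which matches the behaviour of the AVG-based query above.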
I have a table with an id column and a source column.
I want to return only the source values that all ids share.
E.g. in the table below id 1,2,3 all share 10 and 20, but id 3 is missing the source value 30, so 30 is not valid and I want to return 10 and 20.
I'm using MySQL and want to put this in a stored procedure.
How do I do this?
id source
1 10
1 20
1 30
2 10
2 20
2 30
3 10
3 20
You can use the COUNT(DISTINCT ...) function as follows:
SELECT source FROM
table_name
GROUP BY source
HAVING COUNT(DISTINCT id)=(SELECT COUNT(DISTINCT id) FROM table_name)
To do this within a stored procedure:
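-- DELIMITER is only needed in the mysql client, so that the ; inside the body does not end the statement early
DELIMITER //
CREATE PROCEDURE getSourceWithAllIds()
BEGIN
SELECT source FROM
table_name
GROUP BY source
HAVING COUNT(DISTINCT id)=(SELECT COUNT(DISTINCT id) FROM table_name);
END //
DELIMITER ;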
The idea is to count the distinct id values for each source, which COUNT(DISTINCT id) ... GROUP BY source does, and then compare that count with the number of distinct id values in the whole table: HAVING COUNT(DISTINCT id)=(SELECT COUNT(DISTINCT id) FROM table_name).
If the two counts are equal, the source appears with every distinct id in the table.
For example, the distinct ids in the table are (1, 2, 3), count = 3, and the distinct ids for source = 10 are (1, 2, 3), count = 3. For source = 30, the distinct ids are (1, 2), count = 2, so the query does not return it (2 <> 3).
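The counting argument can be checked quickly on the question's data. This is a minimal sketch that runs the same query through Python's built-in sqlite3 module rather than MySQL; the table name table_name is the placeholder used in the answer.

```python
import sqlite3

# (id, source) pairs from the question.
pairs = [(1, 10), (1, 20), (1, 30),
         (2, 10), (2, 20), (2, 30),
         (3, 10), (3, 20)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, source INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", pairs)

# Keep only the sources whose distinct-id count equals the table-wide distinct-id count.
query = """
SELECT source
FROM table_name
GROUP BY source
HAVING COUNT(DISTINCT id) = (SELECT COUNT(DISTINCT id) FROM table_name)
ORDER BY source
"""
shared_sources = [row[0] for row in conn.execute(query)]
print(shared_sources)
```

Source 30 drops out because only ids 1 and 2 carry it.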
See a demo.
I'm trying to fetch the second-highest END_DATE for every listed customer by passing all customer_id values to the WHERE ... IN clause of a subquery. Right now the static customer_id values in the IN clause give me the desired output, but all other customers show NULL.
How can I pass all customer_id values dynamically instead of static values in the subquery's IN clause? I would appreciate your help.
SELECT d.customer_id
, ( SELECT DISTINCT g.end_date
FROM contract g
WHERE g.end_date = (SELECT MAX(g.end_date) FROM contract g WHERE g.customer_id IN ('64','65','69')
AND g.customer_id = d.customer_id
AND g.end_date<(SELECT MAX(g.end_date) FROM contract g WHERE g.customer_id IN ('64','65','69')
AND g.customer_id = d.customer_id)
)) END_DATE
FROM customer_vw d
GROUP BY d.customer_id;
Maybe you're overcomplicating it; analytic functions might help in this case. Here's an example - I'd want to select the 2nd highest hiredate from a table per each department:
SQL> break on deptno
SQL> select deptno, hiredate from emp order by deptno, hiredate desc;
DEPTNO HIREDATE
---------- ----------
10 23.01.1982
17.11.1981 --> this, for department 10
09.06.1981
20 12.01.1983
09.12.1982 -->
03.12.1981
02.04.1981
17.12.1980
30 03.12.1981
28.09.1981 -->
08.09.1981
01.05.1981
22.02.1981
20.02.1981
14 rows selected.
So:
SQL> with temp as
2 (select deptno, hiredate,
3 rank() over (partition by deptno order by hiredate desc) rnk
4 from emp
5 )
6 select deptno, hiredate
7 from temp
8 where rnk = 2;
DEPTNO HIREDATE
---------- ----------
10 17.11.1981
20 09.12.1982
30 28.09.1981
SQL>
As for the "dynamically passed list of values": in Oracle (since you use Oracle SQL Developer, I presume, maybe wrongly, that this is the database you actually use) you'd split that list of values into rows (that's what the subquery in lines #5 - 8 does) and use it in the IN clause:
SQL> with temp as
2 (select deptno, hiredate,
3 rank() over (partition by deptno order by hiredate desc) rnk
4 from emp
5 where deptno in (select regexp_substr('&&par_deptno', '[^,]+', 1, level)
6 from dual
7 connect by level <= regexp_count('&&par_deptno', ',') + 1
8 )
9 )
10 select deptno, hiredate
11 from temp
12 where rnk = 2;
Enter value for par_deptno: 10,30
DEPTNO HIREDATE
---------- ----------
10 17.11.1981
30 28.09.1981
SQL>
In SQL Developer, you'd replace '&&par_deptno' with :par_deptno (i.e. use a bind variable instead of a substitution variable).
You can use LIKE to compare to a string of comma-delimited numbers and the DENSE_RANK analytic function to find the second highest:
SELECT d.customer_id,
( SELECT end_date
FROM (
SELECT end_date,
DENSE_RANK() OVER (ORDER BY end_date DESC) AS rnk
FROM contract g
WHERE ',' || :your_list_of_customers || ',' LIKE '%,' || customer_id || ',%'
AND g.customer_id = d.customer_id
)
WHERE rnk = 2
AND ROWNUM = 1
) AS END_DATE
FROM customer_vw d
GROUP BY
d.customer_id;
Note: Do not use RANK here. If two (or more) rows are tied for first, RANK returns 1, 1 and 3 (joint first, then third), so there is no second place. DENSE_RANK solves this by returning 1, 1, 2 for the same rows.
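The difference between the two functions is easy to see on a tied data set. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or later) instead of Oracle, and the three contract rows are made-up values chosen only to produce a tie for first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contract (customer_id TEXT, end_date TEXT)")
# Hypothetical rows: two contracts share the latest end date.
conn.executemany("INSERT INTO contract VALUES (?, ?)",
                 [("64", "2020-12-31"), ("64", "2020-12-31"), ("64", "2019-06-30")])

query = """
SELECT end_date,
       RANK()       OVER (ORDER BY end_date DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY end_date DESC) AS dense_rnk
FROM contract
WHERE customer_id = '64'
ORDER BY end_date DESC
"""
ranked = conn.execute(query).fetchall()
ranks = [row[1] for row in ranked]
dense_ranks = [row[2] for row in ranked]
# Filtering on dense_rnk = 2 finds the second-highest date; rnk = 2 would find nothing.
second_highest = [row[0] for row in ranked if row[2] == 2]
print(ranks, dense_ranks, second_highest)
```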
I have a table with three columns: Member, id and DOB. I want to assign one id to each unique member. If a member is tagged with more than one id, I have to assign the id that occurs most often. If there is a tie, I have to assign the id with the most recent DOB.
4000 8569 11/11/1993
4111 9653 12/11/1993
4000 8569 12/12/1993
5000 5632 01/01/1993
4000 6932 31/12/1993
4111 6987 06/11/1993
5001 4356 01/01/1993
In the above, members 5000 and 5001 are each tagged to a single id, so each should keep that id. Member 4000 has three rows: two with the same id (8569) and one with a different id (6932), so 8569 should be assigned to member 4000. Member 4111 has two different ids (9653 and 6987), so I look at the most recent DOB for that member and assign 9653.
The output should be like this:
4000 8569
4111 9653
5000 5632
5001 4356
I have tried many approaches, but I couldn't get the right result. Please help me solve this. Thanks in advance.
You can do this with window functions in T-SQL:
create table #t (
Member int,
id int,
DOB date
);
insert into #t
values (4000, 8569, '1993-11-11'),
(4111, 9653, '1993-11-12'),
(4000, 8569, '1993-12-12'),
(5000, 5632, '1993-01-01'),
(4000, 6932, '1993-12-31'),
(4111, 6987, '1993-11-06'),
(5001, 4356, '1993-01-01');
with cte as
(
select *, count(id) over (partition by member, id) cnt from #t
),
cte2 as
(
select *, row_number() over (partition by member order by cnt desc, dob desc) rn from cte
)
select member, id from cte2 where rn = 1;
drop table #t;
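To see the two window functions working together, here is a small sketch that runs essentially the same query through Python's built-in sqlite3 module (SQLite 3.25 or later) rather than SQL Server. The table name t and the ISO date strings are assumptions made for the example.

```python
import sqlite3

rows = [(4000, 8569, "1993-11-11"), (4111, 9653, "1993-11-12"),
        (4000, 8569, "1993-12-12"), (5000, 5632, "1993-01-01"),
        (4000, 6932, "1993-12-31"), (4111, 6987, "1993-11-06"),
        (5001, 4356, "1993-01-01")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (member INTEGER, id INTEGER, dob TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# cte counts how often each id occurs per member;
# cte2 orders by that count, then by the most recent DOB, and keeps the top row.
query = """
WITH cte AS (
    SELECT *, COUNT(id) OVER (PARTITION BY member, id) AS cnt FROM t
),
cte2 AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY member ORDER BY cnt DESC, dob DESC) AS rn
    FROM cte
)
SELECT member, id FROM cte2 WHERE rn = 1 ORDER BY member
"""
assigned = conn.execute(query).fetchall()
print(assigned)
```

Member 4000 is decided by the count (8569 occurs twice), member 4111 by the DOB tie-break.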
I have a table in which the data looks like this:
Request Id Field Id Current Key
1213 11 1001
1213 12 1002
1213 12 103
1214 13 799
1214 13 899
1214 13 7
When the loop starts on the first Request Id, it should check every Field Id for that particular Request Id, so that the data ends up looking like this:
Request Id Field Id Previous Key Current Key
1213 11 null 1001
1213 12 null 1002
1213 12 1002 103
1214 13 null 799
1214 13 799 899
1214 13 899 7
When the very first record for a Field Id within a particular Request Id comes, the Previous Key column should be null and the Current Key stays the same.
When the second record comes for the same Field Id, its Previous Key should take the Current Key of the first record; when the third record comes, it should take the Current Key of the second record, and so on.
When a new Field Id comes, the same process should start again.
Please let me know if you need any more info. Your help is much needed.
You can check this.
declare @t table (Request_Id int, Field_Id int, Current_Key int)
insert into @t values (1213, 11, 1001),(1213, 12, 1002), (1213, 12, 103) , (1214, 13, 799), (1214, 13, 899), (1214, 13, 7)
;with cte
as (
-- dummy row 0, so that the first real row has a row to join to
select 0 rowno,0 Request_Id, 0 Field_Id, 0 Current_Key
union
-- ordering by request_id alone does not fix the order of rows within a request
select ROW_NUMBER() over(order by request_id) rowno, * from @t
)
select
t1.Request_Id , t1.Field_Id ,
case when t1.Request_Id = t2.Request_Id and t1.Field_Id = t2.Field_Id
then t2.Current_Key
else null
end previous_key
, t1.Current_Key
from cte t1, cte t2
where t1.rowno = t2.rowno + 1
Refer link when you want to compare row value
When the second record will come for same field ID...
Tables don't work this way: there is no way to tell that 1213,12,1002 is the "previous" record of 1213,12,103 as you assume in your example.
Do you have any data you can use to sort your records properly? Request id isn't enough because, even if you guarantee that it increments monotonically for each operation, each operation can include multiple values for the same item id which need to be sorted relative to each other.
In SQL Server 2008
You do not have the benefit of the LAG and LEAD functions. Instead, compute the new column with a correlated subquery. Number the rows once with a row_num column, then for each row pick the row with the greatest row_num that is less than the current row_num and has the same request_id and field_id.
;with numbered as (
-- ORDER BY (SELECT NULL) leaves the numbering order undefined; use a real ordering column if the table has one
select t.*, row_number() over (order by (select null)) as row_num
from your_table t
)
select a.request_id,
a.field_id,
(select top 1 x.current_key
from numbered x
where x.request_id = a.request_id
and x.field_id = a.field_id
and x.row_num < a.row_num
order by x.row_num desc
) as previous_key,
a.current_key
from numbered a
In SQL Server 2012+
You can use the LAG or LEAD functions with the OVER clause to get the value from the previous or next nth row:
select
Request_Id,
Field_Id,
lag(Current_Key,1) over (partition by Request_ID, Field_ID order by timestampColumn) as Previous_Key
,Current_Key
from your_table
Pay attention to how the rows are ordered. LAG returns the previous row according to the ORDER BY inside the OVER clause, which SQL Server requires for LAG and LEAD; a table has no default order to fall back on. Here timestampColumn stands for whatever column (such as a datetime) defines the order of your rows.
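As a rough illustration of LAG with an explicit ordering column, the sketch below runs through Python's built-in sqlite3 module (SQLite 3.25 or later) rather than SQL Server. The seq column is hypothetical: the question's table has no ordering column, so one is invented here to record the arrival order of the rows.

```python
import sqlite3

# (seq, request_id, field_id, current_key); seq is an assumed ordering column.
rows = [(1, 1213, 11, 1001), (2, 1213, 12, 1002), (3, 1213, 12, 103),
        (4, 1214, 13, 799), (5, 1214, 13, 899), (6, 1214, 13, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests "
             "(seq INTEGER, request_id INTEGER, field_id INTEGER, current_key INTEGER)")
conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?)", rows)

# LAG looks one row back inside each (request_id, field_id) partition, ordered by seq.
# The first row of every partition has no predecessor, so it gets NULL (None in Python).
query = """
SELECT request_id, field_id,
       LAG(current_key, 1) OVER (PARTITION BY request_id, field_id ORDER BY seq) AS previous_key,
       current_key
FROM requests
ORDER BY seq
"""
with_previous = conn.execute(query).fetchall()
for row in with_previous:
    print(row)
```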
Try this:
declare @tb table (RequestId int, FieldId int, CurrentKey int)
insert into @tb (RequestId, FieldId, CurrentKey) values
(1213,11,1001),
(1213,12,1002),
(1213,12,103),
(1214,13,799),
(1214,13,899),
(1214,13, 7)
select t.RequestId, t.FieldId,
case when t.FieldId = t1.FieldId then t1.CurrentKey end as PreviousKey, t.CurrentKey
from (select *, ROW_NUMBER() over (order by RequestId, FieldId) as rno
from @tb) t
left join (select FieldId, CurrentKey,
ROW_NUMBER() over (order by RequestId, FieldId) as rno
from @tb) t1 on t.rno = t1.rno + 1
I have data like this:
Name VALUE
ClientID M01010001250
InterviewType 1
InterviewDate 7/8/2011
ClientID M01010001260
InterviewType 1
InterviewDate 7/8/2011
ClientID M01010001260
InterviewType 5
InterviewDate 1869-07-01
ClientID M01010001290
InterviewType 1
InterviewDate 7/8/2011
My output should look like this:
SEQ ClientID InterviewType InterviewDate
1 M01100016550 5 9/9/2011
2 M01100016550 5 9/9/2011
3 M01030000680 5 9/9/2011
I have written a query using PIVOT:
SELECT SEQ,ClientID,InterviewType,InterviewDate
FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY NAME,VALUE ORDER BY NAME,VALUE) AS SEQ,NAME,VALUE
FROM Table1) DT
PIVOT (MAX(VALUE)FOR NAME IN(ClientID,InterviewType,InterviewDate))DT1
ORDER BY SEQ
Even though I am using ROW_NUMBER, it does not give the desired output. Please suggest a fix.
The issue here is grouping the rows in threes. Here is a working solution:
;WITH MyCTE AS
(
SELECT ROW_NUMBER() OVER (ORDER BY orderby) AS SEQ,
NAME,
VALUE
FROM (
select 1 as orderby,
*
from Table1
)t
)
SELECT SEQ,
ClientID,
InterviewType,
InterviewDate
FROM (
SELECT ((SEQ-1)/3)+1 AS SEQ,
NAME,
VALUE
FROM MyCTE
) DT
PIVOT (
MAX(VALUE)
FOR NAME
IN(ClientID,InterviewType,InterviewDate)
)DT1
ORDER BY SEQ
You may find an SQL Fiddle Demo
Your query isn't working because it numbers the rows by their value, so regardless of the order they are stored in, the rows with the lowest values come first. The row with InterviewType 5 will always get the highest row number if all the other rows have InterviewType = 1.
Without a way to uniquely identify which entries belong together, the order of rows returned by a query in SQL Server isn't guaranteed. However, if your data is always in the exact format shown above, with rows in the order ClientID, then InterviewType, then InterviewDate, the following should work.
select p.*
from (select *,
CEILING((ROW_NUMBER() OVER (ORDER BY (SELECT 1)) - 1) / 3) as [Row]
from Table1 t) t
PIVOT (max(value) for name in (ClientID, InterviewType, InterviewDate)) p
Output on my test data:
0 M01050001250 16 7/8/2011
1 M01010001260 1 7/8/2011
2 M01010001260 5 1869-07-01
3 M01010001290 1 7/8/2011
(The integer division numbers every group of three rows: the first three get 0, the next three get 1, and so on. CEILING has no further effect because the division result is already a whole number.)
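The group-of-three numbering can be tried without SQL Server. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or later) and makes two assumptions that are not in the original: an explicit pos column records the row order, since an order must come from somewhere, and conditional aggregation with MAX(CASE ...) stands in for PIVOT, which SQLite does not have.

```python
import sqlite3

# Name/value rows from the question, in their original order.
pairs = [("ClientID", "M01010001250"), ("InterviewType", "1"), ("InterviewDate", "7/8/2011"),
         ("ClientID", "M01010001260"), ("InterviewType", "1"), ("InterviewDate", "7/8/2011"),
         ("ClientID", "M01010001260"), ("InterviewType", "5"), ("InterviewDate", "1869-07-01"),
         ("ClientID", "M01010001290"), ("InterviewType", "1"), ("InterviewDate", "7/8/2011")]

conn = sqlite3.connect(":memory:")
# pos is a hypothetical column that captures insertion order (1, 2, 3, ...).
conn.execute("CREATE TABLE table1 (pos INTEGER PRIMARY KEY, name TEXT, value TEXT)")
conn.executemany("INSERT INTO table1 (name, value) VALUES (?, ?)", pairs)

# (row number - 1) / 3 is integer division, so rows 1-3 fall in group 0, rows 4-6 in group 1, ...
query = """
WITH numbered AS (
    SELECT name, value,
           (ROW_NUMBER() OVER (ORDER BY pos) - 1) / 3 AS grp
    FROM table1
)
SELECT grp + 1 AS seq,
       MAX(CASE WHEN name = 'ClientID' THEN value END) AS ClientID,
       MAX(CASE WHEN name = 'InterviewType' THEN value END) AS InterviewType,
       MAX(CASE WHEN name = 'InterviewDate' THEN value END) AS InterviewDate
FROM numbered
GROUP BY grp
ORDER BY grp
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```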