I have data that looks like the following:
Mike 5
Mike 100
Mike 101
Mike 106
Mike 95
Mike 1000
Mike 1001
Mike 1010
Jen 2006
Jen 2001
Jen 2010
Jen 3000
Jen 10
I want to cluster the numbers for each name so that values within 20 of each other fall into the same cluster, and keep only the smallest value in each cluster.
The result should look like this:
Mike 5
Mike 95
Mike 1000
Jen 2006
Jen 3000
Jen 10
Is there any way to do this?
I have thought about GROUP BY with fixed intervals, but that breaks down when a cluster crosses an interval boundary. For example, if I set the ranges to
1-20, 21-40, 41-60
and my data has:
Mike 35
Mike 39
Mike 41
Mike 45
it will be split into two clusters:
Mike 35
Mike 41
whereas what I want is:
Mike 35
Thanks!
If I understand correctly, you want the smallest value for each name to start a "cluster". That cluster in turn contains all rows for the same name whose value is less than 20 above that starting value. The same is then repeated for the remaining rows.
This suggests a recursive CTE:
with recursive tn as (
      select t.*, row_number() over (partition by name order by val) as seqnum
      from t
     ),
     cte as (
      select name, val, seqnum, val as cluster_val, 1 as cluster_num
      from tn
      where seqnum = 1
      union all
      select cte.name, tn.val, tn.seqnum,
             (case when tn.val < cte.cluster_val + 20 then cte.cluster_val else tn.val end) as cluster_val,
             (case when tn.val < cte.cluster_val + 20 then cte.cluster_num + 1 else 1 end) as cluster_num
      from cte join
           tn
           on tn.name = cte.name and tn.seqnum = cte.seqnum + 1
     )
select *
from cte
where cluster_num = 1
order by name, val;
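The recursive step can also be read as one greedy pass over each name's sorted values. The following is a minimal Python sketch of that reading, not the SQL above; the function name `cluster_minima` and the `width` parameter are made up for illustration, and it applies the same strict "less than start + 20" test as the query.

```python
from collections import defaultdict


def cluster_minima(rows, width=20):
    """Return {name: [smallest value of each cluster]}.

    Hypothetical helper: a value stays in the current cluster while it is
    less than `width` above the cluster's smallest value.
    """
    by_name = defaultdict(list)
    for name, val in rows:
        by_name[name].append(val)

    result = {}
    for name, vals in by_name.items():
        starts = []
        for val in sorted(vals):
            # A value too far above the current start opens a new cluster.
            if not starts or val >= starts[-1] + width:
                starts.append(val)
        result[name] = starts
    return result


# Sample data from the question.
rows = [
    ("Mike", 5), ("Mike", 100), ("Mike", 101), ("Mike", 106),
    ("Mike", 95), ("Mike", 1000), ("Mike", 1001), ("Mike", 1010),
    ("Jen", 2006), ("Jen", 2001), ("Jen", 2010), ("Jen", 3000), ("Jen", 10),
]
minima = cluster_minima(rows)
```

On the sample data this keeps 5, 95 and 1000 for Mike and 10, 2001 and 3000 for Jen (2001 being the smallest member of the 2001/2006/2010 cluster).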
For example, I have the following table :
id user_id name age address
1 12 John 21 earth
2 13 Daniel 19 planet
3 12 Paul 25 here
4 11 Joana 23 mars
5 11 Paul 18 earth
The results that I want :
id user_id name age address
1 12 John 21 earth
3 12 Paul 25 here
4 11 Joana 23 mars
5 11 Paul 18 earth
So basically, I want to show all rows whose user_id value occurs more than once in the table. I am new to SQL and hopefully you guys can help me. Thanks in advance.
You can do something like below.
select * from your_table where user_id in (
    select user_id from your_table
    group by user_id having count(*) > 1
)
I would recommend exists for this purpose:
select t.*
from t
where exists (select 1 from t t2 where t2.user_id = t.user_id and t2.id <> t.id)
order by user_id, id;
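In general, it is best to avoid aggregation in subqueries when you can, for performance reasons: exists can stop at the first matching row, whereas the group by version has to count every row for each user_id.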
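To see that both suggestions return the same rows, here is a small sketch that loads the sample table into an in-memory SQLite database through Python's sqlite3 module. The table name `t` follows the second answer; using SQLite is an assumption made only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "name TEXT, age INTEGER, address TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, 12, "John", 21, "earth"),
        (2, 13, "Daniel", 19, "planet"),
        (3, 12, "Paul", 25, "here"),
        (4, 11, "Joana", 23, "mars"),
        (5, 11, "Paul", 18, "earth"),
    ],
)

# EXISTS form: keep a row if some other row shares its user_id.
exists_rows = conn.execute(
    """
    SELECT t.*
    FROM t
    WHERE EXISTS (SELECT 1 FROM t t2
                  WHERE t2.user_id = t.user_id AND t2.id <> t.id)
    ORDER BY user_id, id
    """
).fetchall()

# IN + GROUP BY form: keep a row if its user_id occurs more than once.
in_rows = conn.execute(
    """
    SELECT * FROM t
    WHERE user_id IN (SELECT user_id FROM t
                      GROUP BY user_id HAVING COUNT(*) > 1)
    ORDER BY user_id, id
    """
).fetchall()
```

Both queries drop the row for user_id 13 (Daniel) and keep the other four.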
I am trying to get a participant's rankings in a multi-event tournament.
I can compute the ranking for a single event pretty easily. Is there a way to find the rankings for all events in one go?
Given input: "Bob"
Data example:              Desired output:
Name  | Event | Score      Name | Event | Score | Rank
----------------------     ----------------------------
Bob     1       100        Bob    1       100     1
Bob     2       75         Bob    2       75      3
Bob     3       80         Bob    3       80      2
Jill    2       90
Jill    3       60
Chris   1       70
Chris   2       50
Chris   3       100
Amy     1       85
Amy     2       95
Amy     3       65
The catch: I do not have access to the Rank() function in my version of SQL, and updating is not possible in this scenario.
Clearly I could just compute the ranking for each event separately in a loop, but I'd like to try to do it all in one go.
You can emulate a ranking function in MySQL using a self-join to values with a higher score in the same Event, and then counting the number of higher scores for each participant:
SELECT s1.Name, s1.Event, s1.Score, COUNT(s2.Name)+1 AS Rank
FROM scores s1
LEFT JOIN scores s2 ON s2.Event = s1.Event AND s2.Score > s1.Score
WHERE s1.Name = 'Bob'
GROUP BY s1.Name, s1.Event, s1.Score
ORDER BY s1.Name, s1.Event
Output:
Name Event Score Rank
Bob 1 100 1
Bob 2 75 3
Bob 3 80 2
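The self-join can be tried without a MySQL server. Below is a sketch that runs the same query against the sample data in an in-memory SQLite database via Python's sqlite3 module; SQLite and the alias `score_rank` (chosen to avoid clashing with the RANK keyword in newer MySQL versions) are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (Name TEXT, Event INTEGER, Score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("Bob", 1, 100), ("Bob", 2, 75), ("Bob", 3, 80),
        ("Jill", 2, 90), ("Jill", 3, 60),
        ("Chris", 1, 70), ("Chris", 2, 50), ("Chris", 3, 100),
        ("Amy", 1, 85), ("Amy", 2, 95), ("Amy", 3, 65),
    ],
)

# For each of Bob's rows, count the strictly higher scores in the same
# event; the rank is that count plus one.
bob_ranks = conn.execute(
    """
    SELECT s1.Name, s1.Event, s1.Score, COUNT(s2.Name) + 1 AS score_rank
    FROM scores s1
    LEFT JOIN scores s2 ON s2.Event = s1.Event AND s2.Score > s1.Score
    WHERE s1.Name = ?
    GROUP BY s1.Name, s1.Event, s1.Score
    ORDER BY s1.Name, s1.Event
    """,
    ("Bob",),
).fetchall()
```

Because only strictly higher scores are counted, participants with equal scores in an event would share a rank, which matches how RANK() treats ties.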
Imagine you have members with distinct member IDs and dates of service.
You need to order each member's dates of service in ascending order and return the position of each date in another column (date_count). The final result should look like this:
memberid name date date_count
122 matt 2/8/12 1
122 matt 3/9/13 2
122 matt 5/2/14 3
120 luke 11/15/11 1
120 luke 12/28/14 2
100 john 1/12/10 1
100 john 3/2/12 2
100 john 5/30/12 3
150 ore 5/8/14 1
150 ore 9/9/14 2
The query below runs but does not return date_count as a ranking (1, 2, 3). Instead it returns the same number on every row of a member, and I am not sure why. Its output:
memberid name date_count
122 matt 3
122 matt 3
122 matt 3
120 luke 5
120 luke 5
120 luke 5
100 john 6
100 john 6
150 ore 2
150 ore 2
SELECT A.MEMBERID, A.NAME,A.DATE, COUNT(B.DATE) AS DATE_COUNT FROM #WCV_COUNTS A
INNER JOIN #WCV_COUNTS B
ON A.MEMBERID <= B.MEMBERID
AND A.MEMBERID= B.MEMBERID
GROUP BY A.MEMBERID, A.NAME, A.DATE
ORDER BY A.MEMBERID
Thanks for help in advance!
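Use ROW_NUMBER(). Both of your join conditions compare MEMBERID, so each row is matched with every row of the same member and gets the same count; numbering the rows within each member by date gives the 1, 2, 3 sequence: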
SELECT memberid, name, date,
ROW_NUMBER() OVER (PARTITION BY memberid ORDER BY date) AS date_count
FROM #WCV_COUNTS
ORDER BY memberid, date
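Where window functions are not available, the counting self-join from the question can be made to work by comparing dates instead of member IDs. The sketch below is one way to do that, run in an in-memory SQLite database through Python's sqlite3 module. It assumes a regular table named `wcv_counts` in place of the temp table, dates stored as ISO strings so that text comparison follows date order, and no duplicate dates per member.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wcv_counts (memberid INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO wcv_counts VALUES (?, ?, ?)",
    [
        (122, "matt", "2012-02-08"), (122, "matt", "2013-03-09"),
        (122, "matt", "2014-05-02"),
        (120, "luke", "2011-11-15"), (120, "luke", "2014-12-28"),
        (100, "john", "2010-01-12"), (100, "john", "2012-03-02"),
        (100, "john", "2012-05-30"),
        (150, "ore", "2014-05-08"), (150, "ore", "2014-09-09"),
    ],
)

# Each row is joined to the same member's rows dated on or before it,
# so the count is the row's position in date order.
rows = conn.execute(
    """
    SELECT a.memberid, a.name, a.date, COUNT(b.date) AS date_count
    FROM wcv_counts a
    JOIN wcv_counts b
      ON b.memberid = a.memberid
     AND b.date <= a.date
    GROUP BY a.memberid, a.name, a.date
    ORDER BY a.memberid, a.date
    """
).fetchall()
```

The only change from the question's join is the second condition: `b.date <= a.date` rather than a second comparison of the member IDs.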
I hope someone can help. I have a table....
name date number
John 2014-01-01 5
Sally 2014-01-01 7
John 2013-12-24 2
Sally 2013-12-24 7
John 2013-11-10 1
Sally 2012-11-10 8
I want to get the latest 2 (or x) records for each person, e.g.:
John 2014-01-01 5
John 2013-12-24 2
Sally 2014-01-01 7
Sally 2013-12-24 7
I don't know where to start. If anyone can shed some light on this I would be really grateful. It would also be good (if you have time) if you could explain the solution for learning purposes!
Many thanks
Jules
SELECT x.*
FROM my_table x
JOIN my_table y
ON y.name = x.name
AND y.date >= x.date
GROUP
BY x.name
, x.date
HAVING COUNT(*) <= 2
ORDER
BY name,date DESC;
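The query above works by joining each row to the rows of the same person that are at least as recent; the number of matches is the row's position counted from the newest, so keeping counts of at most 2 keeps the latest two rows. Here is a sketch of that idea run against the sample data in an in-memory SQLite database via Python's sqlite3 module. The helper name `latest_per_name` is hypothetical, and the columns are listed explicitly in the GROUP BY so the statement does not rely on MySQL's lenient grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, date TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("John", "2014-01-01", 5), ("Sally", "2014-01-01", 7),
        ("John", "2013-12-24", 2), ("Sally", "2013-12-24", 7),
        ("John", "2013-11-10", 1), ("Sally", "2012-11-10", 8),
    ],
)


def latest_per_name(conn, n):
    """Hypothetical helper: return the n most recent rows per name."""
    return conn.execute(
        """
        SELECT x.name, x.date, x.number
        FROM my_table x
        JOIN my_table y
          ON y.name = x.name
         AND y.date >= x.date      -- y: rows at least as recent as x
        GROUP BY x.name, x.date, x.number
        HAVING COUNT(*) <= ?       -- position from the newest row
        ORDER BY x.name, x.date DESC
        """,
        (n,),
    ).fetchall()


latest_two = latest_per_name(conn, 2)
```

Changing the argument from 2 to any other x gives the latest x rows per person, provided a person has no two rows with the same date.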
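First count the distinct names:
Select Count(Distinct Names) From "yourtable"
Multiply that count by x and use the result as the limit of your next query (LIMIT does not accept an expression, so substitute the computed number):
Select Distinct Names, date, number From "yourtablename"
Order By Names
Limit <count * x>
Note that this caps the total number of rows returned rather than the number of rows per name.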
If my Data is
Name - playerID - matchID - Innings - Runs
James 1 1 1 5
James 1 1 2 8
Darren 2 1 1 3
Darren 2 1 2 9
James 1 2 1 10
James 1 2 2 12
Darren 2 2 1 13
Darren 2 2 2 19
and my SQL query is
$query = "SELECT playerID, name,
SUM(runs) AS runs_scored,
MAX(runs) AS highest_score
FROM matchPlayer GROUP BY playerID";
Then the output would read
James has scored 35 runs with a highest score of 12
Darren has scored 44 runs with a highest score of 19
Now I wish to get the highest total scored in one match (that is, combining innings 1 & 2).
I have no idea how to start on this query :(
EDIT
The exact info I require is the HIGHEST match total, so James has 13 combined runs from matchID 1 and 22 combined runs from matchID 2 - so the answer I am after is 22.
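You need to do it in two stages: first total the runs per player and match, then aggregate those match totals per player. Carrying name through the inner query avoids joining back to matchPlayer, which would repeat each match total once per innings row and inflate the SUM:
SELECT ms.playerID, ms.name,
       SUM(ms.runs_by_match) AS runs_scored,
       MAX(ms.runs_by_match) AS highest_score
FROM (
    -- stage 1: one row per player and match
    SELECT playerID, name, matchID, SUM(runs) AS runs_by_match
    FROM matchPlayer
    GROUP BY playerID, name, matchID
) AS ms
-- stage 2: aggregate the match totals per player
GROUP BY ms.playerID, ms.name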
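The two-stage idea can be checked against the sample data. The sketch below runs it in an in-memory SQLite database through Python's sqlite3 module. It carries name through the inner query rather than joining back to matchPlayer, a choice made here so that each per-match total is counted exactly once; the alias `highest_match_total` is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matchPlayer (name TEXT, playerID INTEGER, "
    "matchID INTEGER, innings INTEGER, runs INTEGER)"
)
conn.executemany(
    "INSERT INTO matchPlayer VALUES (?, ?, ?, ?, ?)",
    [
        ("James", 1, 1, 1, 5), ("James", 1, 1, 2, 8),
        ("Darren", 2, 1, 1, 3), ("Darren", 2, 1, 2, 9),
        ("James", 1, 2, 1, 10), ("James", 1, 2, 2, 12),
        ("Darren", 2, 2, 1, 13), ("Darren", 2, 2, 2, 19),
    ],
)

totals = conn.execute(
    """
    SELECT ms.playerID, ms.name,
           SUM(ms.runs_by_match) AS runs_scored,
           MAX(ms.runs_by_match) AS highest_match_total
    FROM (
        -- Stage 1: one row per player and match, innings combined.
        SELECT playerID, name, matchID, SUM(runs) AS runs_by_match
        FROM matchPlayer
        GROUP BY playerID, name, matchID
    ) AS ms
    -- Stage 2: aggregate the per-match totals for each player.
    GROUP BY ms.playerID, ms.name
    ORDER BY ms.playerID
    """
).fetchall()
```

For James the inner query yields 13 and 22, so the highest match total is 22 while the overall total stays at 35.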