I have a table like:
Name | ID | Event
Smith| 1 |
Smith| 2 | Y
Smith| 3 |
Jones| 1 |
Jones| 2 | Y
Jones| 3 |
Jones| 4 | Y
I'd like to count the number of times an Event has been seen for each person at each point, e.g.:
Name | ID | Event | Event Count
Smith| 1 | | 0
Smith| 2 | Y | 1
Smith| 3 | | 1
Jones| 1 | | 0
Jones| 2 | Y | 1
Jones| 3 | | 1
Jones| 4 | Y | 2
I'm guessing I can't do this in SQL? If not, can you be very clear how I go about doing this in SAS (or whatever way is appropriate), as I am new to this!
(FYI, this is leading to me being able to differentiate rows that happen before or after each event - i.e. filter by Event = blank, and anything 0 happened before the first event, anything 1 after, etc. There might be an easier way to do this.)
Thanks!
If you want to go down the SAS route: a data step reads data sequentially, so it is very good at this type of problem:
data have;
infile datalines missover;
input Name $ ID Event $;
datalines;
Smith 1
Smith 2 Y
Smith 3
Jones 1
Jones 2 Y
Jones 3
Jones 4 Y
;
run;
proc sort data=have;
by name id;
run;
data want;
set have;
by name id;
if first.name then event_count=0;
event_count+(event='Y');
run;
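For readers without SAS, the same sequential logic can be sketched in Python. This is only an illustration of the data step above, not a replacement for it; the function name `running_event_count` and the tuple layout are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def running_event_count(rows):
    """Return rows as (name, id, event, event_count), sorted by name then id.

    The counter restarts for each name (like first.name in the data step)
    and goes up by one on every row whose event is 'Y'.
    """
    ordered = sorted(rows, key=itemgetter(0, 1))  # same role as proc sort
    result = []
    for _name, group in groupby(ordered, key=itemgetter(0)):
        count = 0  # reset at the first row of each name
        for name, row_id, event in group:
            count += event == "Y"  # True adds 1, False adds 0
            result.append((name, row_id, event, count))
    return result


rows = [
    ("Smith", 1, ""), ("Smith", 2, "Y"), ("Smith", 3, ""),
    ("Jones", 1, ""), ("Jones", 2, "Y"), ("Jones", 3, ""), ("Jones", 4, "Y"),
]
for row in running_event_count(rows):
    print(row)
```

As with the sorted SAS output, the Jones rows come before the Smith rows.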
You could potentially do something like this in a query:
select Name, ID, Event,
(
select count(*)
from MyTable
where Name = t.Name
and Event = 'Y'
and ID <= t.ID
) as EventCount
from MyTable t
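The correlated subquery finds this count for each row. It is something of a triangular join, though (the term comes from SQL Server discussions, but it applies here too): every row re-reads all earlier rows for the same name, so performance isn't wonderful on large groups.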
Note that this should work in virtually any RDBMS.
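To try the query without setting up a database server, here is a small sketch that runs it against the question's sample data. It assumes SQLite through Python's built-in sqlite3 module purely for convenience; the table name MyTable comes from the query above, and blank events are stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Name TEXT, ID INTEGER, Event TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("Smith", 1, None), ("Smith", 2, "Y"), ("Smith", 3, None),
        ("Jones", 1, None), ("Jones", 2, "Y"), ("Jones", 3, None),
        ("Jones", 4, "Y"),
    ],
)

# Same correlated subquery as above, with an ORDER BY added so the
# output order is predictable.
query = """
    SELECT Name, ID, Event,
           (SELECT COUNT(*)
              FROM MyTable
             WHERE Name = t.Name
               AND Event = 'Y'
               AND ID <= t.ID) AS EventCount
      FROM MyTable t
     ORDER BY Name, ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```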
SELECT Name, ID, Event, grpTotal
FROM
(
select Name,
ID,
Event,
@sum := if(@grp = Name,@sum,0) + if(`Event` = 'Y',1,0) as grpTotal,
@grp := Name
from TableName,
(select @grp := '', @sum := 0) vars
order by Name, ID
) s
I have a table of owners
id | owner
--------
1 | Jack
2 | Lee
3 | Daniel
and a table of their transactions
id | owner_id | change
----------------
1 | 1 | 500
2 | 2 | 300
3 | 1 | -100
4 | 2 | 100
5 | 2 | -300
and I'm trying to get the balance of Jack's account. So for example here I would return
500
400
as Jack will first have 500 as his balance and after the change he will have 400.
What I currently have is
SELECT O.id, change FROM Owners O, Transactions WHERE O.id = 1 & Transactions.owner_id = 1;
but I can only get the rows of Jack's change. What can I do to get the balance for each row?
If you are using MySQL 8, you can do something like this:
SELECT SUM(change_v) OVER(ORDER BY id) AS balance
FROM transactions
WHERE owner_id = 1
An alternative could be to use a variable, like this:
SET @balance := 0;
SELECT (@balance := @balance + change_v) AS balance
FROM transactions
WHERE owner_id = 1
ORDER BY id
Keep in mind that change is a reserved word in MySQL, which is why I used the name change_v instead. You can use a reserved word as a column name by quoting it with backticks, but I wouldn't advise doing so.
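If it helps to see what both queries compute, here is the same running total sketched in Python with the question's sample rows. The function name `running_balance` and the tuple layout are just assumptions for the illustration.

```python
from itertools import accumulate

# (id, owner_id, change) rows copied from the question's transactions table
transactions = [
    (1, 1, 500), (2, 2, 300), (3, 1, -100), (4, 2, 100), (5, 2, -300),
]


def running_balance(rows, owner_id):
    """Return the balance after each of the owner's transactions, in id order."""
    changes = [change for _id, owner, change in sorted(rows) if owner == owner_id]
    return list(accumulate(changes))


print(running_balance(transactions, 1))  # [500, 400]
```

Each output value is the sum of all changes up to and including that row, which is what SUM(...) OVER (ORDER BY id) returns.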
I'm offering an experience leaderboard for a Discord bot I actively develop with stuff like profile cards showing one's rank. The SQL query I'm currently using works flawlessly, however I notice that this query takes a rather long processing time.
SELECT id,
discord_id,
discord_tag,
xp,
level
FROM (SELECT @rank := @rank + 1 AS id,
discord_id,
discord_tag,
xp,
level
FROM profile_xp,
(SELECT @rank := 0) r
ORDER BY xp DESC) t
WHERE discord_id = '12345678901';
The table isn't too big (roughly 20k unique records), but this query is taking anywhere between 300-450ms on average, which piles up relatively fast with a lot of concurrent requests.
I was wondering if this query can be optimized to increase performance. I've isolated this to this query, the rest of the MySQL server is responsive and swift.
I'd be happy about any hint and thanks in advance! :)
You're scanning 20,000 rows to assign "row numbers" then selecting exactly one row from it. You can use aggregation instead:
SELECT *, (
SELECT COUNT(*)
FROM profile_xp AS x
WHERE xp > profile_xp.xp
) + 1 AS rnk
FROM profile_xp
WHERE discord_id = '12345678901'
This gives you the player's rank. For a dense rank, use COUNT(DISTINCT xp) instead of COUNT(*). Create an index on the xp column if there isn't one already.
Not an answer; too long for a comment:
I usually write this kind of thing exactly the same way that you have done, because it's quick and easy, but actually there's a technical flaw with this method - although it only becomes apparent in certain situations.
By way of illustration, consider the following:
DROP TABLE IF EXISTS ints;
CREATE TABLE ints (i INT NOT NULL PRIMARY KEY);
INSERT INTO ints VALUES
(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Your query:
SELECT a.*
, @i:=@i+1 rank
FROM ints a
JOIN (SELECT @i:=0) vars
ORDER
BY RAND() DESC;
+---+------+
| i | rank |
+---+------+
| 3 | 4 |
| 2 | 3 |
| 5 | 6 |
| 1 | 2 |
| 7 | 8 |
| 9 | 10 |
| 4 | 5 |
| 6 | 7 |
| 8 | 9 |
| 0 | 1 |
+---+------+
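Look: the rows come back in random order, but the ranking isn't 'random' at all. rank always corresponds to i, because the counter was assigned before the sort was applied.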
Now compare that with the following:
SELECT a.*
, @i:=@i+1 rank
FROM
( SELECT * FROM ints ORDER by RAND() DESC) a
JOIN (SELECT @i:=0) vars;
+---+------+
| i | rank |
+---+------+
| 5 | 1 |
| 2 | 2 |
| 8 | 3 |
| 7 | 4 |
| 4 | 5 |
| 6 | 6 |
| 0 | 7 |
| 1 | 8 |
| 3 | 9 |
| 9 | 10 |
+---+------+
Assuming discord_id is the primary key for the table, and you're just trying to get one entry's "rank", you should be able to take a different approach.
SELECT px.discord_id, px.discord_tag, px.xp, px.level
, 1 + COUNT(leaders.xp) AS rank
, 1 + COUNT(DISTINCT leaders.xp) AS altRank
FROM profile_xp AS px
LEFT JOIN profile_xp AS leaders ON px.xp < leaders.xp
WHERE px.discord_id = '12345678901'
GROUP BY px.discord_id, px.discord_tag, px.xp, px.level
;
Note that I have both "rank" and "altRank". rank should give you a position similar to what you were originally looking for. Your original results could fluctuate for ties; this rank always places tied players at the highest position of their tie. If 3 records tie for 2nd place, each of them (queried separately with this) will show 2nd place, and the next xp down will show 5th place (1 player in 1st; 3 players in 2nd, occupying positions 2, 3 and 4; then 5th). altRank "closes the gaps", putting that next player in the 3rd place "group" instead.
I would also recommend an index on xp to speed this up further.
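To make the tie behaviour concrete, here is a small Python sketch of the two COUNT expressions. The scores are made up for the example and `rank_and_alt_rank` is a hypothetical helper, not part of the query.

```python
def rank_and_alt_rank(all_xp, player_xp):
    """Return (rank, alt_rank) for player_xp, mirroring the two COUNTs.

    rank     = 1 + number of rows with strictly higher xp
    alt_rank = 1 + number of distinct higher xp values
    """
    higher = [xp for xp in all_xp if xp > player_xp]
    return 1 + len(higher), 1 + len(set(higher))


# Hypothetical scores: one leader, three tied players, one player below them
scores = [900, 700, 700, 700, 500]
for xp in sorted(set(scores), reverse=True):
    print(xp, rank_and_alt_rank(scores, xp))
```

The three tied players all get (2, 2), and the player below them gets (5, 3).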
If I have the table:
------------------
| Provider | ID |
------------------
| X | 125 |
------------------
| X | 133 |
------------------
| X | 342 |
------------------
| X | 327 |
------------------
| Y | 123 |
------------------
| Y | 853 |
------------------
| Y | 123 |
------------------
| Z | 853 |
------------------
| Z | 533 |
------------------
| Z | 174 |
------------------
I want to get 2 random entries from each of the providers X and Y (ignoring Z) to produce
X id
X id
Y id
Y id
I've tried several queries including
select id, provider from tableName a where (SELECT COUNT(*) FROM tableName b where b.provider = a.provider) = 2;
Any ideas?
Order by rand() and limit to 2 with a where clause for x and y and union all the results. Plainly obvious what you're trying to do and easy to maintain.
(SELECT Provider, ID
FROM tableName
WHERE provider = 'X'
ORDER BY Rand()
LIMIT 2)
UNION ALL
(SELECT Provider, ID
FROM tableName
WHERE provider = 'Y'
ORDER BY Rand()
LIMIT 2)
How about you try this query:
select provider, ID, Rank from
(
select *, @row_num := IF(@prev_value=provider,@row_num+1,1) AS rank, @prev_value := provider from
(
select provider, ID, rand() as SortingField from yourTable
order by provider, SortingField
) t1,
(SELECT @row_num := 0) x, (SELECT @prev_value := '') y
) src
where Rank <= 2 and Provider <> 'Z'
Okay, so here's how it works. First things first: you need random entries, so I add a "random" sorting field using MySQL's rand() function. Because it returns a new random number for every row, the rows within each provider come out in a different order each time the query runs.
This sets us up for the next part of your requirement, which is the two random entries. Since the rows are sorted randomly, we just pick the first two records for each provider, and they will differ from run to run. To do that, we need a row counter that resets every time the grouping field changes value (in your case, it's 'provider'). That is what the two variables do:
@row_num is your row counter. It is incremented by 1 for every row that has the same provider as the previous row, and reset to 1 otherwise.
@prev_value is your value checker, used to determine when the counter needs to be reset. Notice that it's set after the @row_num calculation, which is critical: if you set it before, you'd be comparing the provider against itself, and the counter would never reset.
Finally, all you have to do is select the desired fields (provider, ID, Rank), keep the rows whose rank is at most 2, and ignore the 'Z' provider, which is what the last line of the query does.
If you have questions, don't hesitate to ask, but I hope my explanation was clear enough.
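If the variable juggling is hard to follow, here is the same idea sketched in Python: attach a random sort key, sort by provider and key, number the rows within each provider, and keep the first two. The function name and its parameters are assumptions for the illustration, not something MySQL provides.

```python
import random


def two_random_per_provider(rows, excluded=("Z",), limit=2, rng=random):
    """Return (provider, id, row_num) for up to `limit` random rows per provider."""
    # Like "rand() as SortingField ... order by provider, SortingField"
    keyed = sorted((provider, rng.random(), row_id) for provider, row_id in rows)
    result = []
    prev_provider, row_num = None, 0
    for provider, _key, row_id in keyed:
        # Counter goes up within a provider and restarts at 1 on a new one
        row_num = row_num + 1 if provider == prev_provider else 1
        prev_provider = provider  # updated only after the comparison
        if row_num <= limit and provider not in excluded:
            result.append((provider, row_id, row_num))
    return result


rows = [
    ("X", 125), ("X", 133), ("X", 342), ("X", 327),
    ("Y", 123), ("Y", 853), ("Y", 123),
    ("Z", 853), ("Z", 533), ("Z", 174),
]
for row in two_random_per_provider(rows):
    print(row)
```

The ids change from run to run, but there are always two rows for X followed by two rows for Y.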
I have a table which looks like this:
+-----------------------
| id | first_name
+-----------------------
| AC0089 | John |
| AC0015 | Dan |
| AC0017 | Marry |
| AC0003 | Andy |
| AC0001 | Trent |
| AC0006 | Smith |
+-----------------------
I need a query to split the rows into ranges of 3 and also display the first id of each range, i.e.
+------------+----------+--------
| startrange | endrange | id
+------------+----------+--------
| 1 | 3 | AC0089
| 4 | 6 | AC0003
+------------+----------+--------
I am pretty new to SQL and have been trying the query below, but I don't think I am anywhere near the correct solution! Here is the query:
select startrange, endrange, id from table inner join (select 1 startRange, 3 endrange union all select 4 startRange, 6 endRange) r group by r.startRange, r.endRange;
It gives the same id every time and I am not able to come up with any other solution. How can I get the required output?
Try this
SET @ct := 0;
select startrange,(startrange + 2) as endrange, seq_no from
(select (c.st - (select count(*) from <table_name>)) as startrange, c.* from
(select (@ct := @ct + 1) as st, b.* from <table_name> as b) c
having startrange mod 3 = 1) as cc;
Sorry for the formatting.
I'm not completely sure what you're trying to do, but if you're trying to convert a table of IDs into ranges, use a CASE WHEN expression:
CASE WHEN startrange in(1,2,3) THEN 1
ELSE NULL
END as startrange,
CASE WHEN endrange in(1,2,3) THEN 3
ELSE NULL
END as endrange,
CASE WHEN ID in(1,2,3) THEN id
WHEN ID in(4,5,6) THEN id
ELSE id
END AS ID
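For reference, this is the output the question seems to be after, sketched in Python. It assumes the ranges are simply blocks of three rows in the order the table lists them, with the first id of each block shown; `id_ranges` is a hypothetical helper name.

```python
def id_ranges(ids, size=3):
    """Return (startrange, endrange, first_id) for each consecutive block of rows.

    endrange is always start + size - 1 (1-based), even if the last block
    holds fewer than `size` rows.
    """
    return [
        (start + 1, start + size, ids[start])
        for start in range(0, len(ids), size)
    ]


# ids in the order the question's table lists them
ids = ["AC0089", "AC0015", "AC0017", "AC0003", "AC0001", "AC0006"]
print(id_ranges(ids))  # [(1, 3, 'AC0089'), (4, 6, 'AC0003')]
```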
The subject of the question is not very explanatory, sorry for that.
Ya so the question follows:
I have a database structure as below where pk is primary key, id
is something which is multiple for many rows.
+------+------+---------------------+
| pk | id | value |
+------+------+---------------------+
| 99 | 1 | 2013-08-06 11:10:00 |
| 100 | 1 | 2013-08-06 11:15:00 |
| 101 | 1 | 2013-08-06 11:20:00 |
| 102 | 1 | 2013-08-06 11:25:00 |
| 103 | 2 | 2013-08-06 15:10:00 |
| 104 | 2 | 2013-08-06 15:15:00 |
| 105 | 2 | 2013-08-06 15:20:00 |
+------+------+---------------------+
What I really need to get is the value difference between the first two rows (ordered by value) for each
group (where the grouping is by id). So according to the above structure I need
timediff(value100, value99) [which is for the id 1 group]
and timediff(value104, value103) [which is for the id 2 group]
i.e. the time difference between the first two rows, ordered by value, in each group.
One way I can think of is 3 self joins (or 3 subqueries): two of them to find the
first two rows, and a third to subtract them. Any suggestions?
Try this. CTEs are pretty powerful!
WITH CTE AS (
SELECT
value, pk, id,
-- position within each id group, earliest value first
ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS rnk
-- position across the whole table, in the same order
, ROW_NUMBER() OVER (ORDER BY id, value) AS rownum
FROM test
)
SELECT
curr.rnk, prev.rnk, curr.rownum, prev.rownum, curr.pk, prev.pk, curr.id, prev.id, curr.value, prev.value, TIMEDIFF(curr.value, prev.value)
FROM CTE curr
-- prev is the row immediately before curr within the same id
INNER JOIN CTE prev on curr.rownum = prev.rownum + 1 and curr.id = prev.id
and curr.rnk = 2
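As a cross-check of what the result should look like for the question's sample rows, here is the same calculation sketched in Python. The function name `first_gap_per_id` and the tuple layout are assumptions for the example.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

FMT = "%Y-%m-%d %H:%M:%S"

# (pk, id, value) rows copied from the question
rows = [
    (99, 1, "2013-08-06 11:10:00"), (100, 1, "2013-08-06 11:15:00"),
    (101, 1, "2013-08-06 11:20:00"), (102, 1, "2013-08-06 11:25:00"),
    (103, 2, "2013-08-06 15:10:00"), (104, 2, "2013-08-06 15:15:00"),
    (105, 2, "2013-08-06 15:20:00"),
]


def first_gap_per_id(rows):
    """Map each id to (second earliest value - earliest value) as a timedelta."""
    parsed = sorted(
        (row_id, datetime.strptime(value, FMT)) for _pk, row_id, value in rows
    )
    gaps = {}
    for row_id, group in groupby(parsed, key=itemgetter(0)):
        values = [value for _id, value in group]
        if len(values) >= 2:  # an id with a single row has no gap
            gaps[row_id] = values[1] - values[0]
    return gaps


for row_id, gap in first_gap_per_id(rows).items():
    print(row_id, gap)
```

Both groups in the sample data come out as a five minute difference.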
Looks a bit weird, but you can try it this way:
SET @previous = 0;
SET @temp = 0;
SET @previousID = 0;
The step above may not be needed, but it makes sure nothing goes wrong.
SELECT pkid, id, diff, valtemp FROM (
SELECT IF(@previousID = id, @temp := @temp + 1, @temp := 1) occ, @previousID := id,
TIMEDIFF(`value`, @previous) diff, pk AS pkid, id, `value` AS valtemp, @previous := `value`
FROM testtable) a WHERE occ = 2