I have ranked rows based on certain criteria. Let's call the rank columns
time_rank and id_rank.
Window_id time_rank id_rank
1 1 1
1 2 1 --> 1 is already considered, reject this row
2 1 1
2 1 2 --> 1 is already considered, reject this row
2 2 2
3 1 1
3 2 1 --> 1 is already considered, reject this row
3 1 2 --> 1 is already considered, reject this row
3 3 2
4 1 1
4 2 1 --> 1 is already considered, reject this row
4 2 2
4 3 1 --> 1 is already considered, reject this row
I tried a few tricks with lag, another ranking, and a self join. None of them seems to work. I need to find the unique combinations with no repetition:
Desired output:
Window_id time_rank id_rank
1 1 1
2 1 1
2 2 2
3 1 1
3 3 2
4 1 1
4 2 2
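One way to read the rejection rule is as a greedy pass: within each Window_id, walk the rows in the listed order and keep a row only if neither its time_rank nor its id_rank already appears in a kept row. The Python sketch below assumes that reading and that row order (both are assumptions, and `pick_unique` is a hypothetical helper name). On the sample data it reproduces the desired output.

```python
from collections import defaultdict

# (Window_id, time_rank, id_rank), in the order given in the question.
rows = [
    (1, 1, 1), (1, 2, 1),
    (2, 1, 1), (2, 1, 2), (2, 2, 2),
    (3, 1, 1), (3, 2, 1), (3, 1, 2), (3, 3, 2),
    (4, 1, 1), (4, 2, 1), (4, 2, 2), (4, 3, 1),
]

def pick_unique(rows):
    """Keep a row only if both of its ranks are still unused in its window."""
    used_time = defaultdict(set)  # Window_id -> time_rank values already kept
    used_id = defaultdict(set)    # Window_id -> id_rank values already kept
    kept = []
    for window_id, time_rank, id_rank in rows:
        if time_rank in used_time[window_id] or id_rank in used_id[window_id]:
            continue  # one of the ranks is already considered: reject
        used_time[window_id].add(time_rank)
        used_id[window_id].add(id_rank)
        kept.append((window_id, time_rank, id_rank))
    return kept

kept = pick_unique(rows)
for row in kept:
    print(*row)
```

Each decision depends on which earlier rows were kept, which may be why a single lag or extra ranking does not capture it; in SQL this would likely need a recursive query or procedural code.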
I need a duplicate checker that counts how many times each number occurs across multiple arrays. Any suggestions on how to make this work? Thank you and have a nice day.
DATA:
id value
1 [5,6,8,4,2]
2 [2,3,4,1,8]
3 [9,3,2,1,10]
Normal result:
number count
1 2
2 3
3 2
4 2
5 1
6 1
7 0
8 2
9 1
10 1
This is my target result with sorting (Highest count):
number count
2 3
1 2
3 2
4 2
8 2
5 1
6 1
10 1
9 1
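A minimal sketch of the counting and sorting in Python, using the three arrays from the question. The tie-breaking rule is an assumption: equal counts are ordered by ascending number here, so 9 comes before 10, whereas the target above lists 10 first and does not say how ties should be ordered.

```python
from collections import Counter

# id -> value, copied from the DATA table in the question.
data = {
    1: [5, 6, 8, 4, 2],
    2: [2, 3, 4, 1, 8],
    3: [9, 3, 2, 1, 10],
}

# Count every number across all arrays. Numbers that never occur
# (such as 7) simply do not appear in the Counter.
counts = Counter(number for values in data.values() for number in values)

# Highest count first; ties broken by ascending number (an assumption).
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for number, count in ranked:
    print(number, count)
```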
I am unable to write the SQL query for the logic described below.
Database: MySQL
Input :
id userId parentid
----------------------
1 1 0
2 2 1
3 3 1
4 4 1
5 5 2
6 6 3
7 7 0
8 8 0
Expected output if userId = 1:
id userId parentid
----------------------
1 1 0
2 2 1
3 3 1
4 4 1
5 5 2
6 6 3
Logic used:
If userId = 1, find the records whose parentid is 1. Including the record for userId = 1 itself, the records are:
id userId parentid
----------------------
1 1 0
2 2 1
3 3 1
4 4 1
Now check the userId column of the above record set again. Apart from userId = 1, there are three userIds: 2, 3 and 4. Next, find the records whose parentid is 2, 3 or 4. The records are now:
id userId parentid
----------------------
1 1 0
2 2 1
3 3 1
4 4 1
5 5 2
6 6 3
If we check the userId column again, the new userIds are 5 and 6, but no record has 5 or 6 as its parentid. So this is the final result set.
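The level-by-level expansion described above is what a recursive common table expression does. Below is a minimal sketch, assuming MySQL 8.0 or later (which supports WITH RECURSIVE) and a hypothetical table name `users`. To keep the example self-contained, it runs the query through Python's built-in sqlite3 module; the same statement should work in MySQL 8.0 with the driver's own placeholder syntax.

```python
import sqlite3

# Sample rows from the question: (id, userId, parentid).
rows = [
    (1, 1, 0), (2, 2, 1), (3, 3, 1), (4, 4, 1),
    (5, 5, 2), (6, 6, 3), (7, 7, 0), (8, 8, 0),
]

conn = sqlite3.connect(":memory:")
# "users" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE users (id INTEGER, userId INTEGER, parentid INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)

query = """
WITH RECURSIVE tree AS (
    -- Anchor: the record of the starting user itself.
    SELECT id, userId, parentid FROM users WHERE userId = ?
    UNION ALL
    -- Recursive step: records whose parentid is a userId already found.
    SELECT u.id, u.userId, u.parentid
    FROM users u
    JOIN tree t ON u.parentid = t.userId
)
SELECT id, userId, parentid FROM tree ORDER BY id
"""

subtree = conn.execute(query, (1,)).fetchall()
for row in subtree:
    print(*row)
```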
I'm using the cluster command and am having difficulties due to insufficient memory. To get around this problem I would like to delete all duplicate observations.
I would like to cluster on the variables A, B and C, and I identify duplicate values as follows:
/* Create dummy data */
input id A B C
1 1 1 1
2 1 1 1
3 1 1 1
4 2 2 2
5 2 2 2
6 2 2 2
7 2 2 2
8 3 3 3
9 3 3 3
10 4 4 4
end
sort A B C id
duplicates tag A B C, gen(dup_tag)
I would like to add a variable dup_id which tells me that ids 2 and 3 are duplicates of id 1, ids 5, 6 and 7 of id 4, and so on. How could I do this?
/* Desired result */
id A B C dup_id
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
4 2 2 2 4
5 2 2 2 4
6 2 2 2 4
7 2 2 2 4
8 3 3 3 8
9 3 3 3 8
10 4 4 4 10
duplicates is a wonderful command (see its manual entry for why I say that), but you can do this directly:
bysort A B C : gen tag = _n == 1
tags the first occurrence of duplicates of A B C as 1 and all others as 0. For the other way round use _n > 1, _n != 1, or whatever.
EDIT:
So then the id of tagged observations is just
by A B C: gen dup_id = id[1]
For basic technique with by: see (e.g.) this discussion
You can refer to the first observation in each group of A B C using the subscript [1] on id. Note the (id) argument in bysort, which sorts by id within each group, but identifies the groups by A, B, and C only.
clear
input id A B C
1 1 1 1
2 1 1 1
3 1 1 1
4 2 2 2
5 2 2 2
6 2 2 2
7 2 2 2
8 3 3 3
9 3 3 3
10 4 4 4
end
bysort A B C (id): gen dup_id = id[1]
li, noobs sepby(dup_id)
yielding
+-------------------------+
| id A B C dup_id |
|-------------------------|
| 1 1 1 1 1 |
| 2 1 1 1 1 |
| 3 1 1 1 1 |
|-------------------------|
| 4 2 2 2 4 |
| 5 2 2 2 4 |
| 6 2 2 2 4 |
| 7 2 2 2 4 |
|-------------------------|
| 8 3 3 3 8 |
| 9 3 3 3 8 |
|-------------------------|
| 10 4 4 4 10 |
+-------------------------+
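For readers who do not use Stata, the same idea (sort by id, then give every row the first id seen in its A B C group) can be sketched in Python. This is an illustrative equivalent under that reading, not a description of how bysort works internally.

```python
# (id, A, B, C) rows from the dummy data above.
rows = [
    (1, 1, 1, 1), (2, 1, 1, 1), (3, 1, 1, 1),
    (4, 2, 2, 2), (5, 2, 2, 2), (6, 2, 2, 2), (7, 2, 2, 2),
    (8, 3, 3, 3), (9, 3, 3, 3),
    (10, 4, 4, 4),
]

first_id = {}  # (A, B, C) -> smallest id in that group
result = []
# Sorting the tuples orders them by id first, so the first id seen
# for each (A, B, C) key is the group's smallest id.
for id_, a, b, c in sorted(rows):
    key = (a, b, c)
    first_id.setdefault(key, id_)
    result.append((id_, a, b, c, first_id[key]))

for row in result:
    print(*row)
```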
item qty
1201-10-005-A 1
1110-01-006-A 1
1112-01-006-A 1
1202-01-008-A 1
1202-01-023-A 1
G-1000-00-003-A 1
Q-2252-00-004-D 1
1150-01-002-A 1
1201-01-009-A 1
1201-01-010-A 1
1201-01-012-A 1
1201-01-013-A 1
1201-02-005-A 1
1201-02-006-A 1
1201-04-001-A 1
1201-05-001-A 1
1201-06-002-A 1
1201-06-003-A 1
1201-06-004-A 1
1201-07-001-A 1
1201-07-002-A 1
1201-07-005-A 1
1201-07-006-A 1
1201-07-009-A 1
1201-07-007-A 1
1201-06-004-A 2
1201-07-001-A 2
1201-07-002-A 2
1201-07-005-A 2
1201-07-006-A 2
1201-07-007-A 2
1201-07-009-A 2
1201-10-005-A 2
1202-01-008-A 2
1202-01-023-A 2
1110-01-006-A 2
1201-06-004-A 3
1201-07-001-a 3
1201-07-002-A 3
1201-07-005-A 3
1201-07-006-a 3
1201-07-007-A 3
1201-07-009-A 3
1201-10-005-A 3
1202-01-008-A 3
1202-01-023-A 3
1110-01-006-A 3
1130-03-009-A 3
1201-06-004-A 4
1201-07-001-A 4
1201-07-002-A 4
1201-07-005-A 4
1201-07-006-A 4
1201-07-007-A 4
1201-07-009-A 4
1201-10-005-A 4
1202-01-008-A 4
1202-01-023-A 4
1110-01-006-A 4
1130-03-009-A 4
1110-01-006-A 5
1130-03-009-A 5
1201-01-009-A 1
0004-08-107-A 1
0010-08-012-A 1
1000-00-003-B 1
When the same item repeats, show only its maximum quantity value.
You need to use GROUP BY (replace your_table with your actual table name):
select item, max(qty) as max_qty
from your_table
group by item;
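To check the grouping on a few of the rows listed above, here is a small sketch that runs the same kind of query through Python's built-in sqlite3 module. The table name `items` is hypothetical, and only a subset of the sample rows is loaded.

```python
import sqlite3

# A subset of the (item, qty) rows from the question.
sample = [
    ("1201-10-005-A", 1), ("1110-01-006-A", 1),
    ("1201-10-005-A", 2), ("1110-01-006-A", 2),
    ("1201-10-005-A", 3), ("1110-01-006-A", 3), ("1130-03-009-A", 3),
    ("1201-10-005-A", 4), ("1110-01-006-A", 4), ("1130-03-009-A", 4),
    ("1110-01-006-A", 5), ("1130-03-009-A", 5),
    ("0004-08-107-A", 1),
]

conn = sqlite3.connect(":memory:")
# "items" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE items (item TEXT, qty INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", sample)

# One output row per item, carrying that item's largest qty.
max_qty = dict(conn.execute(
    "SELECT item, MAX(qty) AS max_qty FROM items GROUP BY item"
).fetchall())

for item, qty in sorted(max_qty.items()):
    print(item, qty)
```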
I'm fairly new to stored procedures.
I was wondering whether this is possible to do in a stored procedure in MySQL.
I have a sequence of approvals, where each position in the sequence is called a step. In the approved column, 1 means yes and 0 means no.
Basically, I have steps 1 to 3 in my approval sequence.
If step 1 has approved = 0, its approver is the first to approve, so only that approver can see the row.
If step 1 has approved = 1, the approver for step 2 can now see the row.
Transaction Steps Table:
id transaction_id approver_id step approved
1 1 1 1 1
2 1 2 2 0
3 1 3 3 0
4 2 3 1 1
5 2 1 2 1
6 2 2 3 0
7 3 2 1 0
8 3 3 2 0
9 3 1 3 0
10 4 1 1 1
11 4 3 2 0
12 4 2 3 0
Example: if my approver_id = 2,
in my view I should see only the approvals that are next in the queue:
id transaction_id approver_id step approved
2 1 2 2 0
6 2 2 3 0
7 3 2 1 0
Please let me know if this is possible. Thank you.
If I understand correctly, you want rows that are the first non-approved for each transaction and the approver is 2.
Try this:
select ts.*
from transactionsteps ts join
(select transaction_id, min(step) as minstep
from transactionsteps
where approved = 0
group by transaction_id
) t
on ts.transaction_id = t.transaction_id and
ts.step = t.minstep
where approver_id = 2;
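As a quick check, the query above can be run against the sample rows from the question. The sketch below uses Python's built-in sqlite3 module rather than MySQL, purely to keep it self-contained, and adds an ORDER BY so the output order is deterministic. It returns ids 2, 6 and 7, matching the expected view.

```python
import sqlite3

# (id, transaction_id, approver_id, step, approved) from the question.
steps = [
    (1, 1, 1, 1, 1), (2, 1, 2, 2, 0), (3, 1, 3, 3, 0),
    (4, 2, 3, 1, 1), (5, 2, 1, 2, 1), (6, 2, 2, 3, 0),
    (7, 3, 2, 1, 0), (8, 3, 3, 2, 0), (9, 3, 1, 3, 0),
    (10, 4, 1, 1, 1), (11, 4, 3, 2, 0), (12, 4, 2, 3, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionsteps ("
    "id INTEGER, transaction_id INTEGER, approver_id INTEGER, "
    "step INTEGER, approved INTEGER)"
)
conn.executemany("INSERT INTO transactionsteps VALUES (?, ?, ?, ?, ?)", steps)

query = """
SELECT ts.*
FROM transactionsteps ts
JOIN (
    -- First step that is still unapproved in each transaction.
    SELECT transaction_id, MIN(step) AS minstep
    FROM transactionsteps
    WHERE approved = 0
    GROUP BY transaction_id
) t ON ts.transaction_id = t.transaction_id AND ts.step = t.minstep
WHERE ts.approver_id = ?
ORDER BY ts.id
"""

pending = conn.execute(query, (2,)).fetchall()
for row in pending:
    print(*row)
```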