Avoiding duplicate values in a SQL query - MySQL

item qty
1201-10-005-A 1
1110-01-006-A 1
1112-01-006-A 1
1202-01-008-A 1
1202-01-023-A 1
G-1000-00-003-A 1
Q-2252-00-004-D 1
1150-01-002-A 1
1201-01-009-A 1
1201-01-010-A 1
1201-01-012-A 1
1201-01-013-A 1
1201-02-005-A 1
1201-02-006-A 1
1201-04-001-A 1
1201-05-001-A 1
1201-06-002-A 1
1201-06-003-A 1
1201-06-004-A 1
1201-07-001-A 1
1201-07-002-A 1
1201-07-005-A 1
1201-07-006-A 1
1201-07-009-A 1
1201-07-007-A 1
1201-06-004-A 2
1201-07-001-A 2
1201-07-002-A 2
1201-07-005-A 2
1201-07-006-A 2
1201-07-007-A 2
1201-07-009-A 2
1201-10-005-A 2
1202-01-008-A 2
1202-01-023-A 2
1110-01-006-A 2
1201-06-004-A 3
1201-07-001-a 3
1201-07-002-A 3
1201-07-005-A 3
1201-07-006-a 3
1201-07-007-A 3
1201-07-009-A 3
1201-10-005-A 3
1202-01-008-A 3
1202-01-023-A 3
1110-01-006-A 3
1130-03-009-A 3
1201-06-004-A 4
1201-07-001-A 4
1201-07-002-A 4
1201-07-005-A 4
1201-07-006-A 4
1201-07-007-A 4
1201-07-009-A 4
1201-10-005-A 4
1202-01-008-A 4
1202-01-023-A 4
1110-01-006-A 4
1130-03-009-A 4
1110-01-006-A 5
1130-03-009-A 5
1201-01-009-A 1
0004-08-107-A 1
0010-08-012-A 1
1000-00-003-B 1
When the same item is repeated, show only its maximum quantity value.

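You need to use GROUP BY together with MAX. The sample data names the column qty, and your_table stands in for the real table name (TABLE itself is a reserved word):
select item, max(qty) as max_qty
from your_table
group by item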
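As a quick way to check the grouping logic outside MySQL, here is a minimal sketch that runs the same GROUP BY / MAX query against an in-memory SQLite database from Python. The table name items and the handful of rows are assumptions picked from the sample data for illustration, not the asker's real schema.

```python
import sqlite3

# Hypothetical table name "items"; a small subset of the rows from the question.
rows = [
    ("1201-10-005-A", 1),
    ("1201-10-005-A", 2),
    ("1201-10-005-A", 4),
    ("1110-01-006-A", 5),
    ("1110-01-006-A", 3),
    ("1000-00-003-B", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item TEXT, qty INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)

# One output row per item, carrying the largest qty seen for that item.
max_qty = dict(
    conn.execute("SELECT item, MAX(qty) FROM items GROUP BY item").fetchall()
)
conn.close()

print(max_qty)
```

Each item appears once in the result, paired with its highest quantity, which is the behaviour the question asks for.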

Related

Unique combination with no repeated individual values

I have ranked rows based on certain criteria. Let's call the rank columns
time_rank and id_rank.
Window_id time_rank id_rank
1 1 1
1 2 1 --> 1 is already considered, reject this row
2 1 1
2 1 2 --> 1 is already considered, reject this row
2 2 2
3 1 1
3 2 1 --> 1 is already considered, reject this row
3 1 2
3 3 2
4 1 1
4 2 1 --> 1 is already considered, reject this row
4 2 2
4 3 1 --> 1 is already considered, reject this row
I tried a few tricks with LAG, another ranking, and a self join, but none seem to work. I need to find the unique combinations with no repetition:
Desired output:
Window_id time_rank id_rank
1 1 1
2 1 1
2 2 2
3 1 1
3 3 2
4 1 1
4 2 2
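One reading of the rule that reproduces the desired output above is a greedy pass: within each Window_id, walk the rows in their given order and keep a row only if neither its time_rank nor its id_rank has already been kept in that window. The sketch below is an assumption about the intended rule, written in Python rather than SQL, and the function name unique_combinations is hypothetical.

```python
from collections import defaultdict

# Rows from the question, in their original order: (window_id, time_rank, id_rank).
rows = [
    (1, 1, 1), (1, 2, 1),
    (2, 1, 1), (2, 1, 2), (2, 2, 2),
    (3, 1, 1), (3, 2, 1), (3, 1, 2), (3, 3, 2),
    (4, 1, 1), (4, 2, 1), (4, 2, 2), (4, 3, 1),
]


def unique_combinations(rows):
    """Greedy filter: keep a row only if both of its ranks are unused in its window."""
    used_time = defaultdict(set)  # window_id -> time_rank values already kept
    used_id = defaultdict(set)    # window_id -> id_rank values already kept
    kept = []
    for window_id, time_rank, id_rank in rows:
        if time_rank in used_time[window_id] or id_rank in used_id[window_id]:
            continue  # one of the ranks was already taken by an earlier kept row
        used_time[window_id].add(time_rank)
        used_id[window_id].add(id_rank)
        kept.append((window_id, time_rank, id_rank))
    return kept


result = unique_combinations(rows)
for row in result:
    print(*row)
```

The result depends on the row order within each window, so a different ordering could keep different rows. Rejected rows do not reserve their ranks, which is why the row 4 2 2 survives even though 4 2 1 was rejected.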

Tag duplicates with first occurrence ID

I'm using the cluster command and am having difficulties due to insufficient memory. To get around this problem I would like to delete all duplicate observations.
I would like to cluster via the variables A, B and C and I identify duplicate values as so:
/* Create dummy data */
input id A B C
1 1 1 1
2 1 1 1
3 1 1 1
4 2 2 2
5 2 2 2
6 2 2 2
7 2 2 2
8 3 3 3
9 3 3 3
10 4 4 4
end
sort A B C id
duplicates tag A B C, gen(dup_tag)
I would like to add a variable dup_ID which tells me that ids 2 and 3 are duplicates of id 1, ids 5 and 6 of id 4, and so on. How could I do this?
/* Desired result */
id A B C dup_id
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
4 2 2 2 4
5 2 2 2 4
6 2 2 2 4
7 2 2 2 4
8 3 3 3 8
9 3 3 3 8
10 4 4 4 10
duplicates is a wonderful command (see its manual entry for why I say that), but you can do this directly:
bysort A B C : gen tag = _n == 1
tags the first occurrence of duplicates of A B C as 1 and all others as 0. For the other way round use _n > 1, _n != 1, or whatever.
EDIT:
So then the id of tagged observations is just
by A B C: gen dup_id = id[1]
For basic technique with by: see (e.g.) this discussion
You can refer to the first observation in each group of A B C using the subscript [1] on id. Note the (id) argument in bysort, which sorts by id within each group but identifies the groups by A, B, and C only.
clear
input id A B C
1 1 1 1
2 1 1 1
3 1 1 1
4 2 2 2
5 2 2 2
6 2 2 2
7 2 2 2
8 3 3 3
9 3 3 3
10 4 4 4
end
bysort A B C (id): gen dup_id = id[1]
li, noobs sepby(dup_id)
yielding
+-------------------------+
| id A B C dup_id |
|-------------------------|
| 1 1 1 1 1 |
| 2 1 1 1 1 |
| 3 1 1 1 1 |
|-------------------------|
| 4 2 2 2 4 |
| 5 2 2 2 4 |
| 6 2 2 2 4 |
| 7 2 2 2 4 |
|-------------------------|
| 8 3 3 3 8 |
| 9 3 3 3 8 |
|-------------------------|
| 10 4 4 4 10 |
+-------------------------+
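For readers who want to see the same idea outside Stata, here is a minimal Python sketch of the technique: sort by id, then remember the first id seen for each (A, B, C) combination. The function name tag_first_id is hypothetical, and the data is the dummy data from the question.

```python
# Dummy data from the question: (id, A, B, C).
records = [
    (1, 1, 1, 1), (2, 1, 1, 1), (3, 1, 1, 1),
    (4, 2, 2, 2), (5, 2, 2, 2), (6, 2, 2, 2), (7, 2, 2, 2),
    (8, 3, 3, 3), (9, 3, 3, 3),
    (10, 4, 4, 4),
]


def tag_first_id(records):
    """Append dup_id: the smallest id among records sharing the same (A, B, C)."""
    first_id = {}
    tagged = []
    # Sorting by id first plays the role of the (id) argument in bysort.
    for rec_id, a, b, c in sorted(records):
        key = (a, b, c)
        first_id.setdefault(key, rec_id)  # only the first id per group is stored
        tagged.append((rec_id, a, b, c, first_id[key]))
    return tagged


tagged = tag_first_id(records)
for row in tagged:
    print(*row)
```

Keeping only the rows where id equals dup_id would then leave one observation per duplicate group.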

R join 2 data frames

Hello, I would like to know how I can merge 2 data frames in R. There is a merge function, but I would like to do this:
data frame1
X Y Z
1 1 1 1
2 1 1 1
3 1 1 1
4 1 1 1
5 1 1 1
data frame 2
A B C
1 2 2 2
2 2 2 2
3 2 2 2
mergedataframe
X Y Z A B C
1 1 1 1
2 1 1 1
3 1 1 1 2 2 2
4 1 1 1 2 2 2
5 1 1 1 2 2 2
The thing is, I must synchronize 3 CSV files (data frames) and I have no idea how to do it with R.
If somebody has any idea about it, thank you.
I edited my post; I would like my merged data frame to look like this:
data frame1
X Y Z
1 1 1 1
2 1 1 1
3 1 1 1
4 1 1 1
5 1 1 1
6 1 1 1
data frame 2
A B C
1 2 2 2
2 2 2 2
mergedataframe
X Y Z A B C
1 1 1 1
2 1 1 1
3 1 1 1 2 2 2
4 1 1 1 2 2 2
5 1 1 1
6 1 1 1
df1 <- data.frame(X=rep(1,5),Y=1, Z=1)
df2 <- data.frame(A=rep(2,3),B=2, C=2)
#rownames(df2) <- 3:5
rownames(df2) <- tail(rownames(df1), nrow(df2))
mergedataframe <- merge(df1,df2, by=0, all=TRUE)
mergedataframe <- mergedataframe[,-1]
mergedataframe
X Y Z A B C
1 1 1 1 NA NA NA
2 1 1 1 NA NA NA
3 1 1 1 2 2 2
4 1 1 1 2 2 2
5 1 1 1 2 2 2
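The answer above lines the second data frame up with the last rows of the first by reusing row names, then does an outer merge. A minimal Python sketch of the same alignment idea, using plain lists of tuples instead of data frames, might look like this. The function name align_to_tail is hypothetical, and None stands in for NA.

```python
def align_to_tail(left, right, fill=None):
    """Pair each row of `right` with the last len(right) rows of `left`.

    Rows of `left` that have no partner are padded with `fill`.
    """
    if len(right) > len(left):
        raise ValueError("right must not have more rows than left")
    width = len(right[0]) if right else 0
    pad_rows = len(left) - len(right)
    # Pad the top of `right` so both tables have the same number of rows.
    padded = [(fill,) * width] * pad_rows + [tuple(r) for r in right]
    return [tuple(l) + r for l, r in zip(left, padded)]


df1 = [(1, 1, 1)] * 5  # columns X, Y, Z
df2 = [(2, 2, 2)] * 3  # columns A, B, C

merged = align_to_tail(df1, df2)
for row in merged:
    print(*row)
```

This covers only the first layout in the question, where the second table sits against the bottom rows; the edited layout, where it sits in the middle, would need an explicit starting row.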

retrieve integer name in shortest.path function of igraph

First, I have a shortest path matrix generated with igraph (shortest path).
When I want to retrieve the node names with get.shortest.paths, it just gives me the number of each column and not its name:
[,a] [,b] [,c] [,d] [,e] [,f] [,g] [,h] [,i] [,j]
[a,] 0 1 2 3 4 5 4 3 2 1
[b,] 1 0 1 2 3 4 5 4 3 2
[c,] 2 1 0 1 2 3 4 5 4 3
[d,] 3 2 1 0 1 2 3 4 5 4
[e,] 4 3 2 1 0 1 2 3 4 5
[f,] 5 4 3 2 1 0 1 2 3 4
[g,] 4 5 4 3 2 1 0 1 2 3
[h,] 3 4 5 4 3 2 1 0 1 2
[i,] 2 3 4 5 4 3 2 1 0 1
[j,] 1 2 3 4 5 4 3 2 1 0
then:
get.shortest.paths(g, 5, 1)
the answer is:
[[1]]
[1] 5 4 3 2
I want the node names not their numbers. Is there any solution? I checked vpath, too.
This does the trick for me:
paths <- get.shortest.paths(g, 5, 1)$vpath
names <- V(g)$name
lapply(paths, function(x) { names[x] })
There is a slightly simpler solution that does not use lapply:
paths <- get.shortest.paths(g, 5, 1)
V(g)$name[paths$vpath[[1]]]
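Both answers boil down to indexing the vector of vertex names with the vertex numbers in the path. The Python sketch below illustrates only that lookup, not igraph itself: the names a to j are taken from the matrix labels in the question, and the path is the one printed there. Since R indexes from 1 and Python from 0, the sketch subtracts 1.

```python
# Vertex names as labelled in the question's matrix (assumed order a..j).
vertex_names = list("abcdefghij")

# Path as printed in the question, using 1-based vertex numbers.
path = [5, 4, 3, 2]

# Translate each 1-based vertex number into its name.
named_path = [vertex_names[i - 1] for i in path]
print(named_path)
```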

How to order by max(a,b) in SQL?

Consider the following table:
id a b
--------------
1 5 1
2 2 3
3 4 2
4 3 6
5 0 1
6 2 2
I would like to order it by max(a,b) in descending order, so that the result will be:
id a b
--------------
4 3 6
1 5 1
3 4 2
2 2 3
6 2 2
5 0 1
What would be the SQL query to perform such ordering?
Use GREATEST (your_table stands in for the real table name, since TABLE is a reserved word):
SELECT *
FROM your_table
ORDER BY GREATEST(a, b) DESC
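To try the ordering without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. Note the swap: SQLite has no GREATEST, so the sketch uses SQLite's two-argument scalar MAX(a, b), which plays the same role here. The table name t is an assumption.

```python
import sqlite3

# Rows from the question: (id, a, b).
rows = [(1, 5, 1), (2, 2, 3), (3, 4, 2), (4, 3, 6), (5, 0, 1), (6, 2, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# With two or more arguments, SQLite's MAX is a per-row scalar function,
# standing in for MySQL's GREATEST(a, b).
ordered_ids = [
    row[0] for row in conn.execute("SELECT id FROM t ORDER BY MAX(a, b) DESC")
]
conn.close()

print(ordered_ids)
```

The sample data has no ties on the larger value; if it did, a second sort key such as id would be needed to make the order deterministic.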