I found the solution of doing it, but I don't like it. Because each time we change one relation we must duplicate all tree records instead of one record. Who have any ideas how to do it in other way, or optimize my - you are welcome! Thanks!
The simpliest self to self tree table:
`categoies`
id|parent_id|label|
--+---------+------
1 NULL A
2 1 B
3 2 C
4 3 D
5 4 E
"A > B > C > D > E" nested set
I want to save all relation changes. I think that I need to do it in this way:
add table revisions
`revisions`
id | created |
---+----------------------
1 2015-02-02 09:00:00
2 2015-03-02 09:01:00
3 2015-04-02 09:00:00
4 2015-04-07 09:02:00
change table categories
`categoies`
id|label|
--+-------
1 A
2 B
3 C
4 D
5 E
add table category_revisions
`category_revisions`
id|rev_id|category_id|parent_category_id|
--+------+-----------+-------------------
1 1 1 null
2 1 2 1
3 1 3 2
4 1 4 3
5 1 5 4 # A > B > C > D > E
6 2 1 null
7 2 2 1
8 2 3 1
9 2 4 3
10 2 5 4 # A > B
# > C > D > E
Related
I have a Dataframe like this
Sys_id
Id
A
4
A
5
A
6
A
100
A
2
A
3
A
4
A
5
A
6
A
7
B
100
B
2
B
3
B
4
B
5
B
6
B
100
I want to fetch the Id's between Id==100 how can I get that by partition using sys_id
I want an output like this
Sys_id
Id
A
2
A
3
A
4
A
5
A
6
A
7
B
2
B
3
B
4
B
5
B
6
I tried using
Windowspec=Window.partitionBy("sys_id").orderBy("timestamp")
df=df.withColum("id",(df.id==100).cast("int")
df=df.withColumn("next_id",lead('id',1).over(Windowspec))
Is there any alternative way to get the answer?
So I have the following chart here where I have columns:
a b c d
1 1 1 3
1 1 1 4
1 1 1 5
2 2 2 1
2 2 2 3
2 2 2 3
3 3 3 4
3 3 3 5
3 3 3 6
What I want to do is add and average column d where columns a,b,c contain the same values. How would I go about doing this?
I imagine it would be something like
Select SUM(Table.d) where a = b AND b = c AS e
Try this:
Select SUM(Table.d), AVG(Table.d) from Table Group by Table.a, Table.b, Table.c
I have many files, but I can not find how to bind column.
For example, files are followed
[1.txt]
ID Score
A 1
B 2
C 3
D 4
[2.txt]
ID Score
A 2
B 2
C 3
D 4
[3.txt]
ID Score
A 4
B 4
C 5
D 3
I want to make
A 1 2 4
B 2 2 4
C 3 3 5
D 4 4 3
You could use cbind() as follows:
df_final <- cbind(cbind(df1, df2["Score"]), df3["Score"])
df_final
ID Score Score Score
1 A 1 2 4
2 B 2 2 4
3 C 3 3 5
4 D 4 4 3
Note that if you were trying to match IDs between data frames which did not coincidentally have the order you want, then you would be asking more for a database style join. In this case, R offers the merge() function from baseR.
Demo
I'm using the clustercommand and am having difficulties due to insufficient memory. To get around this problem I would like to delete all duplicate observations.
I would like to cluster via the variables A, B and C and I identify duplicate values as so:
/* Create dummy data */
input id A B C
1 1 1 1
2 1 1 1
3 1 1 1
4 2 2 2
5 2 2 2
6 2 2 2
7 2 2 2
8 3 3 3
9 3 3 3
10 4 4 4
end
sort A B C id
duplicates tag A B C, gen(dup_tag)
I would like to add a variable dup_ID which tells me that ids 2 and 3 are duplicates of id 1, ids 5 and 6 of id 4, and so on. How could I do this?
/* Desired result */
id A B C dup_id
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
4 2 2 2 4
5 2 2 2 4
6 2 2 2 4
7 2 2 2 4
8 3 3 3 8
9 3 3 3 8
10 4 4 4 10
duplicates is a wonderful command (see its manual entry for why I say that), but you can do this directly:
bysort A B C : gen tag = _n == 1
tags the first occurrence of duplicates of A B C as 1 and all others as 0. For the other way round use _n > 1, _n != 1, or whatever.
EDIT:
So then the id of tagged observations is just
by A B C: gen dup_id = id[1]
For basic technique with by: see (e.g.) this discussion
You can refer to the first observation in each group of A B C using the subscript [1] on ID. Note the (id) argument in bysort, which sorts by id, but identifies the groups by A, B, and C only.
clear
input id A B C
1 1 1 1
2 1 1 1
3 1 1 1
4 2 2 2
5 2 2 2
6 2 2 2
7 2 2 2
8 3 3 3
9 3 3 3
10 4 4 4
end
bysort A B C (id): gen dup_id = id[1]
li, noobs sepby(dup_id)
yielding
+-------------------------+
| id A B C dup_id |
|-------------------------|
| 1 1 1 1 1 |
| 2 1 1 1 1 |
| 3 1 1 1 1 |
|-------------------------|
| 4 2 2 2 4 |
| 5 2 2 2 4 |
| 6 2 2 2 4 |
| 7 2 2 2 4 |
|-------------------------|
| 8 3 3 3 8 |
| 9 3 3 3 8 |
|-------------------------|
| 10 4 4 4 10 |
+-------------------------+
I am trying make a complex query in MySQL i have following two tables
departments employees
Id parent title id name department status
--------------------------- ------------------------------------
1 0 Health 1 abc 3 1
2 0 Sports 2 def 3 1
3 0 Education 3 ghi 5 1
4 1 Physical 4 jkl 10 1
5 1 Mental 5 kkk 6 1
6 2 Football 6 lll 6 1
7 2 Baseball 7 sss 8 1
8 2 Beachball 8 xxx 6 1
9 2 Hockey 9 yyy 6 1
10 4 ENT 10 zzz 7 1
11 0 Finance 11 mnb 11 1
Departments table have four main departments(i-e: parent = 0) with multiple level depths of sub-departments.
Currently i have done it through a PHP function by running queries multiple time to achieve this, but still i want know if this is possible to fetch it with a Query.
Whats the best way OR how to select max 3 employee randomly for each main department with status 1.
The expected result should be something like
id name department maindep
1 abc 3 3
2 def 3 3
3 ghi 5 1
4 jkl 10 1
5 kkk 6 2
7 sss 8 2
10 zzz 7 2
11 mnb 11 11