Finding logical change log - mercurial

I'm trying to figure out a way to get an oddly-specific set of logs. This is input into a different program, where I'm parsing the logs and doing stuff with them, but ideally it would be great to do as much as possible with the hg commands to minimize my post-processing.
I want all commits marked with "O" and none of the "X" ones:
A5 O
   |
   |
   |
A4 O   X B4   X C2
   |\  |      |
   | \ |      |
   |  \|      |
A3 O   O B3   |
   |   |      |
   |   |      |
   |   |      |
A2 O   O B2   X C1
   |   |      |
   |   |      |
   |   |      |
   |   O B1   |
   |  /|      |
   | / |      |
   |/  |      |
A1 X   X B0   X C0
   |
   |
   |
A0 X
Given this chart, where A, B, and C are different branches, our users want a log of changes between A2 and A5. The issue is that they also want to know the rest of the history of any branches merged into A.
hg log -r A2:A5 will return:
A2,3,4,5
B2,3,4
C1,2
First off, I don't want C whatsoever. It isn't connected to anything here.
But what I do want is B1, or more generally all changes in B since it was last merged into A. Also I don't want B4. So if I have hanging tails that connect farther up, I need to find the rest. Annoyingly, they do not want A1.
My current plan is to log A2:A5, then build a tree structure from parsing the results. At the end I look for any hanging tails and, for each one, get the log between A2 and the common ancestor of that tail and A2, restricted to that branch. That's sort of convoluted and crazy.
Any ideas or suggestions to make this easy and reduce the work I have to do to post-process?

An improvement on your current approach will be hg log -r "A5 % parents(A2)". This is equivalent to A5 and all its ancestors, less A1 and any of its ancestors, so it returns:
A2,A3,A4, and A5
B0,B1,B2, and B3
Notably, the following will be excluded:
Any changeset in C
Changesets in B that haven't been merged into A (e.g. B4)
There is only one undesired changeset in the resulting revset: B0. The criterion for removing it is a little unclear to me (it would probably help to see the ancestors of A0, B0 and C0, as they will all stem from a common node at some point). I think the stop conditions for walking backwards along branches that merge into A2::A5 need to be clarified before a revset can be constructed.
However, that revset will probably be quite complicated, and it may be easier to postprocess the above revset instead.
Edit: Some further thoughts
You may be better off doing multiple different revsets:
hg log -r "A2::A5" returns the DAG from A2 to A5 (i.e. A2,A3,A4,A5)
hg log -r "(parents( A2::A5 & merge() ) - ( A2::A5 + parents(A2) ) )" will return any changeset that has been merged into the DAG from A2 to A5 (i.e. B3)
The first one goes directly into your final result set. The second one you can iterate over (imagine there's also a branch D with a D3 that's merged into the branch of interest), traversing each branch towards the stop criteria and adding the relevant changesets to the final result set.
Iterating over the merged branches and stopping at the right time may be simpler than trying to prune incorrectly included changesets from a larger result set.
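To make the revsets above concrete, here is a minimal Python sketch that models the question's chart as a parent map and mimics "A5 % parents(A2)", "A2::A5" and the merged-in-parents expression. It does not call Mercurial; the PARENTS table and the helper names are hypothetical, and the shape of the graph (B1 merging A1 into B, B0 and C0 having no recorded parents) is an assumption read off the chart.

```python
# Hypothetical model of the question's chart: revision -> list of parents.
# Assumptions: B1 has parents B0 and A1, A4 has parents A3 and B3,
# and A0, B0, C0 are treated as roots because the chart shows nothing below them.
PARENTS = {
    "A0": [], "A1": ["A0"], "A2": ["A1"], "A3": ["A2"],
    "A4": ["A3", "B3"], "A5": ["A4"],
    "B0": [], "B1": ["B0", "A1"], "B2": ["B1"], "B3": ["B2"], "B4": ["B3"],
    "C0": [], "C1": ["C0"], "C2": ["C1"],
}


def ancestors(revs):
    """Return the given revisions plus everything reachable via parent links."""
    seen, stack = set(), list(revs)
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(PARENTS[rev])
    return seen


def only(x, y):
    """Mimic the revset 'x % y': ancestors of x that are not ancestors of y."""
    return ancestors(x) - ancestors(y)


def dag_range(start, end):
    """Mimic 'start::end': ancestors of end that have start as an ancestor."""
    return {r for r in ancestors([end]) if start in ancestors([r])}


# "A5 % parents(A2)"
first = only(["A5"], PARENTS["A2"])

# "A2::A5"
mainline = dag_range("A2", "A5")

# "parents(A2::A5 & merge()) - (A2::A5 + parents(A2))"
merges = {r for r in mainline if len(PARENTS[r]) == 2}
merged_in = {p for r in merges for p in PARENTS[r]} - (mainline | set(PARENTS["A2"]))

print(sorted(first))
print(sorted(mainline))
print(sorted(merged_in))
```

Under these assumptions the first set contains A2 to A5 plus B0 to B3, which matches the result described above, and the last set is the starting point (B3) for walking each merged branch backwards.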

What's the best way to retrieve array data from MySql

I'm storing an object / data structure like this inside a MySql (actually a MariaDb) database:
{
idx: 7,
a: "content A",
b: "content B",
c: ["entry c1", "entry c2", "entry c3"]
}
And to store it I'm using 2 tables, very similar to the method described in this answer: https://stackoverflow.com/a/17371729/3958875
i.e.
Table 1:
+-----+---+---+
| idx | a | b |
+-----+---+---+
Table 2:
+------------+-------+
| owning_obj | entry |
+------------+-------+
And then made a view that joins them together, so I get this:
+-----+------------+------------+-----------+
| idx | a | b | c |
+-----+------------+------------+-----------+
| 7 | content A1 | content B1 | entry c11 |
| 7 | content A1 | content B1 | entry c21 |
| 7 | content A1 | content B1 | entry c31 |
| 8 | content A2 | content B2 | entry c12 |
| 8 | content A2 | content B2 | entry c22 |
| 8 | content A2 | content B2 | entry c32 |
+-----+------------+------------+-----------+
My question is what is the best way I can get it back to my object form? (e.g. I want an array of the object type specified above of all entries with idx between 5 and 20)
There are 2 ways I can think of, but both seem to be not very efficient.
Firstly we can just send this whole table back to the server, and it can make a hashmap with the keys being the primary key or some other unique index, and collect up the different c columns, and rebuild it that way, but that means it has to send a lot of duplicate data, and take a bit more memory and processing time to rebuild on the server. This method also won't be very pleasant to scale if we have multiple arrays, or have arrays within arrays.
Second method would be to do multiple queries, filter Table 1 and get back the list of idx's you want, and then for each idx, send a query for Table 2 where owning_obj = current idx. This would mean sending a whole lot more queries.
Neither of these options seems very good, so I'm wondering if there is a better way. Currently I'm thinking it can be something that utilizes JSON_OBJECT(), but I'm not sure how.
This seems like a common situation, but I can't seem to find the exact wording to search for to get the answer.
PS: The server interfacing with MySql/MariaDb is written in Rust, don't think this is relevant in this question though
You can use GROUP_CONCAT to combine all the c values into a comma-separated string.
SELECT t1.idx, t1.a, t1.b, GROUP_CONCAT(entry) AS c
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t1.idx = t2.owning_obj
GROUP BY t1.idx
Then explode the string in PHP:
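$result_array = [];
while ($row = $result->fetch_assoc()) {
    // The LEFT JOIN yields NULL for objects without entries; map that to an empty array.
    $row['c'] = $row['c'] === null ? [] : explode(',', $row['c']);
    $result_array[] = $row;
}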
However, if the entries can be long, make sure you increase group_concat_max_len (the default in MySQL is 1024 bytes, and longer results are truncated).
If you're using MySQL 8.0 you can also use JSON_ARRAYAGG(). This will create a JSON array of the entry values, which you can convert to a PHP array using json_decode(). This is a little safer, since the GROUP_CONCAT() result becomes ambiguous if any of the values contain a comma. You can change the separator with the SEPARATOR keyword, but you need a separator that will never appear in any value. JSON_ARRAYAGG() was not in MariaDB when this was written; it was added in MariaDB 10.5.
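The separator problem described above can be illustrated without a database. This is a small Python sketch that only simulates the two result formats: the comma-joined string stands in for what GROUP_CONCAT() would return, and the JSON text stands in for what a JSON array aggregate would return. The sample entries are made up.

```python
import json

# Made-up entries; the second one contains a comma on purpose.
entries = ["entry c1", "entry, with comma", "entry c3"]

# Stand-in for a GROUP_CONCAT() result with the default ',' separator.
concatenated = ",".join(entries)

# Stand-in for a JSON array aggregate result.
aggregated = json.dumps(entries)

# Splitting on the separator cannot tell data commas from separator commas.
split_back = concatenated.split(",")

# Decoding the JSON text recovers the original values exactly.
decoded = json.loads(aggregated)

print(split_back)
print(decoded)
```

Splitting yields four pieces for three entries, while the JSON round trip returns the original list, which is why the JSON route is the safer one when values are free text.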

Transpose survey response dataset with Open Refine (previously Google Refine)

I’m looking for some help to reshape a survey response dataset, exported as a csv, using Open Refine (previously Google Refine).
Some context on the survey
Collector and responder ID are collected in the background - ID1 ID2
Users select tasks from a long list - T{n}
Users enter a custom task - OT
Users rate the importance of each selected task - R1
Users rate the satisfaction of each selected task - R2
We have a total of 20 tasks at the moment, but this might change.
Current dataset as follows:
ID1 | ID2 | T1 | » | T20 | OT | T1 R1 | » | T20 R1 | OT R1 | T1 R2 | » | T20 R2 | OT R2
123 | 789 |
I’m trying to reshape the dataset to the following format:
ID1 | ID2 | Task | Importance | Satisfaction
Here’s a gist of original and reshaped data sets
Also, I’ve tried to articulate how I want to reshape the data in a drawing, which might help
This can't be done by clicking a single button. You have to perform three "transpose cells across columns into rows" (one for tasks, one for their importance, one for their satisfaction), then three "join multivalued cells", then three "split multivalued cells", and finally use fill down to fill the blanks in the ID columns. A screencast will probably be clearer than my explanations.
You'll find the JSON operations in a comment on your Gist. If your columns have exactly the same names as in the example provided, you can apply them to your project by copying and pasting the file into "Undo/Redo -> Apply".
Try the following:
Concatenate all the content for each task with an expression such as cells['Task1'].value+"|Importance: "+cells['Task Importance 1'].value+"|Satisfaction:"+cells['Task Satisfaction 1'].value. You will need to do that 20 times (once for each group of task columns).
Transpose all columns after Response ID (not included). You can reuse this operation.
Split the cells on the pipe character |.
Finish by renaming the columns and cleaning up the values with value.replace().
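For readers who want to check what the target shape looks like, the wide-to-long reshaping that both answers aim for can be sketched in plain Python. This is not an OpenRefine operation; the reshape function, the sample row, and the column naming ("T1", "T1 R1", "T1 R2", "OT", ...) are assumptions based on the layout in the question.

```python
def reshape(row, task_columns):
    """Turn one wide survey row into one long row per selected task."""
    long_rows = []
    for col in task_columns:
        task = row.get(col, "")
        if not task:
            # Empty cell: this respondent did not select the task.
            continue
        long_rows.append({
            "ID1": row["ID1"],
            "ID2": row["ID2"],
            "Task": task,
            "Importance": row.get(f"{col} R1", ""),
            "Satisfaction": row.get(f"{col} R2", ""),
        })
    return long_rows


# Made-up sample row; a real export could be read with csv.DictReader.
wide = {
    "ID1": "123", "ID2": "789",
    "T1": "Task one", "T1 R1": "5", "T1 R2": "3",
    "T2": "", "T2 R1": "", "T2 R2": "",
    "OT": "My custom task", "OT R1": "4", "OT R2": "2",
}

long_rows = reshape(wide, ["T1", "T2", "OT"])
for long_row in long_rows:
    print(long_row)
```

Each selected task becomes its own row with the two ID columns repeated, which is the same effect the fill down step achieves in OpenRefine.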

Unique combinations of variables in Stata

I need help writing Stata code that gives me the unique combinations of variables. I have 7 variables and I need to run code that gives me every unique combination of them. Every row will be a unique combination of all 7 variables.
An example:
V1: A, B, C
V2: 1, 2, 3
A1 A2 A3, B1 B2 B3, C1 C2 C3
Unique combination of all variables - total 9 combinations.
I have 15000 observations. I got code for this in R, but R fails with a memory error on data this large. I want to do this in Stata.
It is not especially clear what you want created or done. There is no code here, not even R code showing how what you want is done in R. There is no reproducible example.
You might want to check out egen, group(). (A previous answer to this effect from @Dimitriy V. Masterov, an experienced user of Stata, was twice incorrectly deleted as spam, presumably by people not knowing Stata.)
Alternatively, try installing groups from SSC.
UPDATE: The answer sounds more like fillin. For "unique" read "distinct".
Bit of a late response, but I just stumbled across this today. If I understand the question, something like this should do the trick, although I'm not sure it's easily applied to more complex data or if this would even be the best way...
* Create Sample Data
clear
set obs 3
gen str var1 = "a" in 1
replace var1="b" in 2
replace var1="c" in 3
gen var2= _n
* Find number of Unique Groupings to set obs
by var1 var2, sort: gen groups=_n==1
keep if groups==1
drop groups
di _N^2
set obs 9
* Create New Variable
forvalues i = 4(3)9 {
    forvalues j = 5(3)9 {
        forvalues k = 6(3)9 {
            replace var1="a" if _n==`i'
            replace var1="b" if _n==`j'
            replace var1="c" if _n==`k'
        }
    }
}
sort var1
egen i=seq(), f(1) t(3)
tostring i, replace
gen NewVar=var1+i
list NewVar
     +--------+
     | NewVar |
     |--------|
  1. |     a1 |
  2. |     a2 |
  3. |     a3 |
  4. |     b1 |
  5. |     b2 |
     |--------|
  6. |     b3 |
  7. |     c1 |
  8. |     c2 |
  9. |     c3 |
     +--------+
Unfortunately as far as I know, there is no easy way to do this - it will require a fair amount of code. Although, I saw another answer or comment that mentioned cross which could be very useful here. Another command worth checking out is joinby. But even with either of these methods, you will have to split your data into 7 different sets based on the variables you want to 'cross combine'.
Anyway, Good Luck if you haven't yet found your solution.
If you just want the combinations of those 7 variables, you can do it like this:
keep v1 v2 v3 v4 v5 v6 v7
duplicates drop
list
Then you will get the list of unique combinations of those 7 variables. You can save the result under a different name from the original dataset. Make sure that you do not save over the original dataset; otherwise you will lose your original data.
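The answers in this thread read the question in two different ways: the distinct combinations that actually occur in the data (what duplicates drop keeps) versus every possible combination of the observed values (closer to fillin or cross). The difference can be sketched in Python with two made-up variables instead of seven; the sample observations are purely illustrative.

```python
from itertools import product

# Made-up observations of two variables (the question has seven).
observations = [("A", 1), ("A", 1), ("B", 2), ("C", 3), ("A", 2)]

# Reading 1: distinct combinations that actually occur in the data.
observed = sorted(set(observations))

# Reading 2: every possible combination of the values seen per variable.
v1_values = sorted({v1 for v1, _ in observations})
v2_values = sorted({v2 for _, v2 in observations})
possible = list(product(v1_values, v2_values))

print(observed)
print(possible)
```

Here the first reading gives 4 rows and the second gives 9, so it is worth deciding which one is wanted before picking a Stata command.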

How to delete a head?

I have pushed some files by mistake and it shows different heads in main repository. How can I delete that head?
You can enable the "mq" extension by editing your .hgrc file.
Make sure the following lines are present:
[extensions]
mq =
Afterwards you can "strip" a specific revision, which deletes it (together with any descendants) so that you have only one head:
hg strip ...
I don't think that you actually want to delete the heads. If you do that, you will lose the work that was done in those branches.
You probably want to merge the heads back into one branch.
Say that you have a tree like this:
o 4 : Head 1
|
o 3 : Another commit
|
| o 2 : Head 2
| |
|/
o 1 : A commit
|
o 0 : Initial commit
To get rid of the additional head without losing the work contained within it you would merge the two heads (revisions 2 and 4 in this example) like this:
hg update 4
hg merge 2
hg commit -m "Merge"
That will create another commit which has all the changes in revisions 2, 3 and 4 in a single head like this:
o 5: Merge
|\
o | 4 : Head 1
| |
o | 3 : Another commit
| |
| o 2 : Head 2
| |
|/
o 1 : A commit
|
o 0 : Initial commit
This is standard procedure when multiple developers work on the same repository.
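Why the merge leaves a single head can be checked with a small Python sketch of the two trees above. It does not call Mercurial; the heads function and the parent maps are a hypothetical model in which a head is simply a revision that no other revision lists as a parent.

```python
def heads(parents):
    """Return the revisions that are not a parent of any other revision."""
    referenced = {p for ps in parents.values() for p in ps}
    return sorted(set(parents) - referenced)


# The tree before the merge: revision -> list of parent revisions.
before = {0: [], 1: [0], 2: [1], 3: [1], 4: [3]}

# The merge commit (revision 5) has both former heads as parents.
after = dict(before)
after[5] = [4, 2]

print(heads(before))
print(heads(after))
```

Before the merge the heads are revisions 2 and 4; afterwards only revision 5 remains, and nothing has been removed from the history.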

How to catch-up named mercurial branch from default branch without merging the two into one?

I have two branches in mercurial..
default      named
|r1
|r2
|r3 -------- named branch created here.
|            |r4
|            |r5
|r6          |
|            |r7
|            |
-----------> | r8 How do I achieve this catch-up?
|            |
I want to update the named branch from default, but I'm not ready to merge the branches yet. How do I achieve this?
Edit:
Additionally, what would the operation be using the GUI?
Is it.. right-click r6, merge with..., r8,... then what? commit to named branch?
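Run hg merge default while your working directory is on the named branch (hg update to it first), then commit the result. The merge commit is created on the named branch, so the named branch picks up the changes from default while default itself stays as it is.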