Removing history using BFG Cleaner not working - bfg-repo-cleaner

Removing history of deleted folders using BFG
I am using BFG to completely delete the history of deleted folders from a repo:
git rev-list --all --objects -- path/to/the/directory/to/delete | git cat-file --batch-check='%(objectname) %(objecttype) %(rest)' | grep -Pe '^\w+ blob' | cut -d' ' -f1 > ./to-delete.txt
I got the list of object IDs, but when I pass them to BFG using the command below, it throws an error:
java -jar bfg.jar --no-blob-protection --strip-blobs-with-ids ./to-delete.txt
Error:
java.exe : Error: Option --strip-blobs-with-ids failed when given 'RCC.txt'. MainException: class org.eclipse.jgit.errors.InvalidObjectIdException(Invalid id: ??e 4 c 8 e 1 b b 1 7 8 2 4 8 7 1 9 2 9 9 b 0 1 5 b 1 5 0 8 3 9 2 7 b d e f f 5 b)
At G:\Dev_Migration_Scripts\RepoCleaner_New.ps1:29 char:35

Try bfg -B 1 --no-blob-protection
Read the BFG usage instructions and this guide.
git rm -r --cached . removes everything from Git's index (the cache). This always fixes my problems.
There's git reset --hard, which leaves only committed files, and git clean -d -x -f, which removes untracked files, including directories (-d) and files ignored by Git (-x).
Read the documentation for git rm, git clean, and git reset.
I know the last ones are not from BFG, but they might help you.
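One safety note on git clean: -n (--dry-run) previews what would be deleted, so you can check the list before running it for real:
$ git clean -d -x -n    # list what would be removed
$ git clean -d -x -f    # actually remove it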

I used the same technique as you and had the same issue, with BFG raising the InvalidObjectIdException for the blob hashes generated using git rev-list and git cat-file.
The issue is with the text file's encoding. If you look at the invalid id ??e 4 c 8 e 1 b b 1 7 8 2 4 8 7 1 9 2 9 9 b 0 1 5 b 1 5 0 8 3 9 2 7 b d e f f 5 b, you can see the extra characters; it would just be e4c8e1bb178248719299b015b15083927bdeff5b if the hash were correct but not found.
In my case I used Notepad++ for filtering the blobs I wanted to remove and for some reason the encoding switched to UCS-2 LE BOM. Switching back to UTF-8 fixed the issue.
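If the list has already been written with the wrong encoding, one low-tech fix is to convert it before handing it to BFG (a sketch, assuming iconv is available and the file really is UTF-16/UCS-2 LE with a BOM):
$ iconv -f UTF-16 -t UTF-8 to-delete.txt > to-delete-utf8.txt
$ java -jar bfg.jar --no-blob-protection --strip-blobs-with-ids ./to-delete-utf8.txt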

Related

Mercurial command to delete older branch

This is my workflow:
I have a branch that needs to get updated.
So before working I do hg sync, which syncs with p4head but creates a new branch.
So once I do hg xl I see something like:
# xxxxxx bugid: working changes (a)
|
|
| o xxxxx bugid: working changes (c)
| |
| o xxxxxx bugid: working changes (b)
|/
o
What I want to do is eliminate b and c but keep a.
Is there a dry run that I can test?
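One possible approach, assuming the bundled strip extension is enabled and <rev-of-b> is a placeholder for b's actual revision: hg log with a revset can serve as the dry run, since it lists exactly what strip would remove.
$ hg log -r "descendants(<rev-of-b>)"   # dry run: b and everything built on it
$ hg strip -r <rev-of-b>                # removes b and its descendants (including c)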

How to use Bash to create arrays with values from the same line of many files?

I have a number of files (in the same folder) all with the same number of lines:
a.txt
20
3
10
15
15
b.txt
19
4
5
8
8
c.txt
2
4
9
21
5
Using Bash, I'd like to create an array of arrays that contain the value of each line in every file. So, line 1 from a.txt, b.txt, and c.txt. The same for lines 2 to 5, so that in the end it looks like:
[
[20, 19, 2],
[3, 4, 4],
...
[15, 8, 5]
]
I'm actually using jq to get these lists in the first place, as they're originally specific values within a JSON file I download every X minutes. I used jq to get the values I needed into different files as I thought that would get me further, but now I'm not sure that was the way to go. If it helps, here is the original JSON file I download and start with.
I've looked at various questions that somewhat deal with this:
Creating an array from a text file in Bash
Bash Script to create a JSON file
JQ create json array using bash
Among others. But none of these deal with taking the value of the same line from various files. I don't know Bash well enough to do this and any help is greatly appreciated.
Here’s one approach:
$ jq -c -n '[$a,$b,$c] | transpose' --slurpfile a a.txt --slurpfile b b.txt --slurpfile c c.txt
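With the sample files above, this prints:
[[20,19,2],[3,4,4],[10,5,9],[15,8,21],[15,8,5]]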
Generalization to an arbitrary number of files
In the following, we'll assume that the files to be processed can be specified by *.txt in the current directory:
jq -n -c '
[reduce inputs as $i ({}; .[input_filename] += [$i]) | .[]]
| transpose' *.txt
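Note that the shell expands *.txt in lexical order, so with the sample files the values are read in a, b, c order and the result is the same array as above.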
Use paste to join the files, then read the input as raw text, splitting on the tabs inserted by paste:
$ paste a.txt b.txt c.txt | jq -Rc 'split("\t") | map(tonumber)'
[20,19,2]
[3,4,4]
[10,5,9]
[15,8,21]
[15,8,5]
If you want to gather the entire result into a single array, pipe it into another instance of jq in slurp mode. (There's probably a way to do it with a single invocation of jq, but this seems simpler.)
$ paste a.txt b.txt c.txt | jq -R 'split("\t") | map(tonumber)' | jq -sc
[[20,19,2],[3,4,4],[10,5,9],[15,8,21],[15,8,5]]
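For what it's worth, here is one single-invocation variant (a sketch; with -n, inputs makes jq read the raw lines one at a time):
$ paste a.txt b.txt c.txt | jq -Rcn '[inputs | split("\t") | map(tonumber)]'
[[20,19,2],[3,4,4],[10,5,9],[15,8,21],[15,8,5]]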
I could not come up with a simple way, but here's one that gets the job done.
1. Join files and create a CSV-like file
If your machine has join, you can create joined records from two files (like a JOIN in SQL).
To do this, make sure your files are sorted.
The easiest way, I think, is just to number each line. The line number works as a primary key, as in SQL.
$ cat a.txt | nl > a.txt.nl
$ cat b.txt | nl > b.txt.nl
$ cat c.txt | nl > c.txt.nl
Now you can join the sorted files into one. Note that join can join only two files at once, which is why I piped the output into the next join.
$ join a.txt.nl b.txt.nl | join - c.txt.nl > conc.txt
Now conc.txt is:
1 20 19 2
2 3 4 4
3 10 5 9
4 15 8 21
5 15 8 5
2. Create JSON from the CSV-like file
It seems a little complicated.
jq -Rsn '
[inputs
| . / "\n"
| (.[] | select((. | length) > 0) | . / " ") as $input
| [$input[1], $input[2], $input[3]] | map(tonumber)]
' <conc.txt
Actually I do not know the detailed syntax or usage of jq, but it seems to be doing this:
split the input file by \n
split a given line by space, then select the valid (non-empty) data
put the split fields in the appropriate locations by their index and convert them to numbers
I used this question as a reference:
https://stackoverflow.com/a/44781106/10675437

How to find the (latest) descendant revision of a given (possibly renamed) file

Assume an hg repository with arbitrary changes, including file renames/copies.
Some time in the past I picked an arbitrary file revision from the repo and noted down its changeset/revision:
File A revision 2 changed with changeset aaabbbccceeefffggg
I now (after several possible committed changes) want to know which current file in my repo is a descendant of the originally noted file/revision.
For example, the following file history (incl. renames of A):
C tip (renamed from B rev 7)
B 7
B 6
B 5
B 4 (renamed from A rev 3)
A 3
A 2
A 1
The starting point of my problem is file A revision 2.
How do I traverse to C (find out the path of C and get the revision of C too)?
The problem is that A is currently not visible at all in my repo (because it was renamed to something else):
hg log --follow A
abort: cannot follow file not in parent revision: "A"
Somehow I need a reversed “--follow”, i.e. going up the version history (future) instead of down (past).
Update to revision 2 and then call the log --follow command.
I have not tried it, but the message "cannot follow file not in parent revision" suggests that the file should be in the parent of the working directory.
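A minimal sketch of that suggestion, reusing the changeset hash from the question (once the file exists in the parent of the working directory, --follow no longer aborts):
$ hg update -r aaabbbccceeefffggg
$ hg log --follow A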

Did the behavior of `hg backout` change since the hg book was written?

I created a new repository, test-backout, and added a new file in it, file. I then made 4 commits, each time appending the number of the commit to file using:
echo [manually entered number] >> file
hg commit -m '[manually entered number]'
In effect, file had:
init
1
2
3
According to the hg book, if I run hg backout --merge 2, I should have:
init
1
3
but instead, it fails to merge and opens up my difftool (vimdiff), and I get 3 options:
init | init | init
1    | 1    |
2    |      |
3    |      |
I initially tried it with the --merge option, then again without it. My question now is, is there still a way for me to get:
init
1
3
Did I just make a mistake or miss something, or am I stuck with those options?
A big factor in why you got the 3-way merge is that your context is too artificial, and I will get to that.
If I take a 50-line text file and change a different part and commit each change, I won't have to resolve conflicts. And what I mean is I have 4 changesets: rev 0 adds the file, revs 1, 2, and 3 each change one area of the file: the beginning, middle, or end.
In this situation, when I do hg backout 2, it makes a reverse of rev 2 and merges those changes to my working directory, and when I commit, the graph is linear:
# backout 2
|
o 3
|
o 2
|
o 1
|
o initial
If I instead do hg backout 2 --merge, it automatically commits the backout as a child of the revision it is backing out, and then merges that with the tip, producing a branched graph after I commit the merge:
# merge
|\
| o backout 2
| |
o | 3
|/
o 2
|
o 1
|
o initial
In both situations, I didn't have to do any 3-way merging. The reason you don't automatically get
init
1
3
and instead have to do a 3-way merge is that the changes are too close together. The context and changes in each changeset completely overlap (the default number of context lines for a diff hunk is 3, which still encompasses the entire file in your 4th changeset).
A similar example is if you had 3 changesets that each modified the same line. If you backed out the middle change like you're doing here, you would still be presented with a 3-way merge that you'll likely have to manually edit to get correct.
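If it helps, here is a minimal way to reproduce the conflict-free case described above (a sketch assuming GNU sed; the edited lines are arbitrary, just spaced far enough apart that the diff hunks don't overlap):
$ hg init backout-demo && cd backout-demo
$ seq 1 50 > file
$ hg add file && hg commit -m 'initial'
$ sed -i 's/^5$/five/' file && hg commit -m '1'
$ sed -i 's/^25$/twenty-five/' file && hg commit -m '2'
$ sed -i 's/^45$/forty-five/' file && hg commit -m '3'
$ hg backout 2                # reverse-applies rev 2 to the working copy; no 3-way merge
$ hg commit -m 'backout 2'    # graph stays linear, and line 25 is back to '25'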
By the way, behavior did change in 1.7, as attested by hg help backout:
Before version 1.7, the behavior without --merge was equivalent to specifying --merge followed by "hg update --clean ." to cancel the merge and leave the child of REV as a head to be merged separately.
However, I don't think that's quite what you suspected.

Mercurial - should .hgtags be merged?

If you are merging changes from repository B into repository A, should you merge the changes in .hgtags?
Repository B could have had tags 1.01, 1.02, and 1.03, which are not in A. Why would you ever merge those into repository A's .hgtags file? If we merged and then tried to view repository A by looking at tag 1.01, I would think this wouldn't work.
Short answer: this does work properly, and you should merge .hgtags.
Why should you actually merge .hgtags and why does it make sense?
So you have
Repo A with Changesets 3 (a1), 4 (a2), 5 (a3)
Repo B with Changesets 3 (b1), 4 (b2), 5 (b3) tag 1.01
Each entry above is the changeset number, a stand-in for the long unique hex id in parentheses, and the tag, if any.
So you merge Repo B into Repo A and get something that looks like:
         9 (a4) merge
        /      \
   5 (a3)       8 (b3) tag 1.01
      |         |
   4 (a2)       7 (b2)
      |         |
   3 (a1)       6 (b1)
        \      /
         2 (a0)
If you update the repo to tag 1.01, you will get exactly what the code looked like at that point in time when it was in Repo B, just as Mercurial promises.
You should merge them: the changesets from Repo B that were tagged are now part of the changeset tree in Repo A, so the changesets you tagged in Repo B should now be tagged in Repo A as well. Not merging them would just cause you to lose the tags you created for those changesets.
An interesting thing to know (from the Mercurial wiki):
The 'effective' tags are taken from the .hgtags files on the heads of all branches. Tags closest to tip take precedence.
So when you merge (combine two heads), you need to merge the .hgtags or some tags will disappear.
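As a rough sketch of what that looks like in practice (internal:union is a built-in merge tool that keeps the lines from both sides, which is usually the right resolution for .hgtags):
$ hg merge                               # merge the head pulled from Repo B
$ hg resolve -t internal:union .hgtags   # if .hgtags conflicted, keep both sides' tag lines
$ hg commit -m 'merge repo B'
$ hg update 1.01                         # the tag from Repo B now resolves in Repo A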