Getting git log output in pretty format - JSON

I have this command:
git log --all --pretty=format:'%n{%n "CommitHash": "%H",%n "Author": "%an",%n "AuthorEmail": "%ae",%n "Date": "%ad",%n "Message": "%f",%n},'
How can I modify it to get this: "Merge":"....." and this: Merge branch 'master' of ..........
Those two things appear when you use the git log --all --graph command, but I am trying to put the information from that command into the pretty format above, and so far I have been able to get everything except those two things.

You cannot get the "Merge":"....." with a format: it's simply not available that way.
You can get all the parent hash IDs using %p (abbreviated parent hash IDs) or %P (full parent hash IDs). Note, however, that you will get parent hash IDs of non-merge commits as well. The difference is, of course, that if there are two or more parent hash IDs—these will be separated by spaces—the commit in question is a merge commit.
The Merge branch 'master' of <url> text is simply the body of the commit message, in a merge commit in which whoever made the merge let git pull dictate the body contents. This is available via %b (body only) or %B (subject plus body). Once again, you will get this for all commits, not just merge commits.
If you are attempting to produce valid JSON from arbitrary commits (including message bodies), you should not attempt this solely with --pretty=format:... directives, but rather with an external program that can make any necessary changes to the message-body text so that it does not disrupt the JSON stream. For instance, a commit message body that contains a double quote or a newline will be a problem.
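As a minimal sketch of the external-program approach, the Python below asks git for fields separated by control characters (%x1f between fields, %x1e between records) and leaves all escaping to json.dumps. The function names, the key names, and the choice of separators are assumptions made for illustration. It uses %s (the raw subject) rather than %f, and treats a commit as a merge when %P lists two or more parents.

```python
import json
import subprocess

FIELD_SEP = "\x1f"   # ASCII unit separator, emitted by git as %x1f
RECORD_SEP = "\x1e"  # ASCII record separator, emitted by git as %x1e
FIELDS = ["%H", "%P", "%an", "%ae", "%ad", "%s", "%b"]
KEYS = ["CommitHash", "Parents", "Author", "AuthorEmail", "Date", "Subject", "Body"]


def parse_log(raw):
    """Turn separator-delimited git log output into a list of dicts."""
    commits = []
    for record in raw.split(RECORD_SEP):
        record = record.strip("\n")  # drop the newline git puts between commits
        if not record:
            continue
        commit = dict(zip(KEYS, record.split(FIELD_SEP)))
        commit["Parents"] = commit["Parents"].split()
        commit["IsMerge"] = len(commit["Parents"]) >= 2
        commits.append(commit)
    return commits


def git_log_json(repo="."):
    """Run git log in `repo` (hypothetical helper) and return a JSON string."""
    fmt = "%x1f".join(FIELDS) + "%x1e"
    raw = subprocess.run(
        ["git", "log", "--all", "--pretty=format:" + fmt],
        cwd=repo, check=True, capture_output=True,
        text=True, encoding="utf-8", errors="replace",
    ).stdout
    return json.dumps(parse_log(raw), indent=2)
```

Because json.dumps escapes quotes and newlines itself, a message body such as Merge branch 'master' of <url> followed by arbitrary text cannot break the JSON stream.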

Related

Mercurial: Most recent change per file

I'm looking for a way to make Mercurial output a table like this:
File     Most recent revision changing the file   Date of that revision
====     ======================================   =====================
foo.py   44159adb0312                             2018-09-16 12:24
...      ...                                      ...
This is just like GitHub does it on the "Code" overview page (for example, for torvalds/linux).
"Most recent" could refer to the date or to the DAG hierarchy relative to the current changeset, or maybe to the current branch. Perhaps the latter is more useful, but in my particular use case, it doesn't make a difference.
I'd also like to be able to provide a list of files or a subdirectory for which I want the table. (I don't necessarily want it for everything.)
I am aware that I could do it using a small script, looping over hg log -l 1 <file>, but I was wondering if there is a more efficient / more natural solution.

You won't get around looping over all files, but hg manifest gives you that list of files. Then template the output as needed (note the trailing \n, so that each file gets its own line):
for f in $(hg manifest); do hg log -l1 "$f" -T "$f\t\t{rev}:{node|short}\t\t{date|isodate}\n"; done
This gives output like
.hgignore 38289:f9c426385853 2018-06-09 13:34 +0900
.hgsigs 38289:f9c426385853 2018-06-09 13:34 +0900
.hgtags 38289:f9c426385853 2018-06-09 13:34 +0900
You might want to tweak the output formatting further. See the Mercurial wiki for a complete overview of output templating.

Git will follow the commit DAG, because that's all it has. In Mercurial, you have (many) more options because you have more data.
Probably the ideal option here is follow(file, .) (combined with first or last as appropriate). But as hg help revset will tell you, you have the following options (I've shrunk the list to the obvious applicable ones):
ancestors(set[, depth])
Use this with the set being . to get ancestors of the current commit, for instance, if you want to do DAG-following a la Git. Or, use ::., which is basically the same.
branch(string or set)
Use this with . to get all commits in the current branch. Combine with other restrictors (e.g., parents) to avoid looking at later commits in the current branch if you're not at the tip of the current branch.
file(pattern)
Use this with a glob pattern to find changesets that affect a given file.
filelog(pattern)
Like file but faster, trading off some accuracy for speed (see documentation for further details).
follow([file[, startrev]])
To quote the documentation:
An alias for "::." (ancestors of the working directory's first parent).
If file pattern is specified, the histories of files matching given
pattern in the revision given by startrev are followed, including
copies.
modifies(pattern)
Use this (with any pattern, not just glob) to find changesets that modify some file or directory. I think this is limited to M type modifications, not addition or removal of files, as there is also adds(pattern) and removes(pattern). Use all three, or-ed together, to find any add/modify/remove operations.
first(set, [n])
last(set, [n])
limit(set[, n[, offset]])
Use these to extract a particular entry out of the revset.
When searching forwards (the default), last(follow(file, .)) seems to work nicely to locate the correct revision. As you noted, you have to do this once per file—it will definitely go faster if you write your own Mercurial plug-in to do this without reloading the rest of the system all the time.
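One way to avoid starting hg once per file is a single hg log over the ancestors of the working directory's parent, newest first, recording the first changeset seen for each file. The following is only a sketch under assumptions: the template layout, the function names, and the reverse(::.) revset are illustrative choices, it follows the changeset DAG rather than per-file history with copies, and file names containing newlines are not handled.

```python
import subprocess

# Assumed layout: one header line per changeset, then one line per touched
# file, then an empty line. hg itself expands the \t and \n escapes.
TEMPLATE = "{node|short}\\t{date|isodate}\\n{files % '{file}\\n'}\\n"


def latest_per_file(log_text, wanted=None):
    """Map each file to the (node, date) of the first changeset listing it.

    `log_text` must be ordered newest first. `wanted` optionally restricts
    the result to a set of file names.
    """
    latest = {}
    header = None
    for line in log_text.splitlines():
        if not line:            # blank line ends the current changeset
            header = None
        elif header is None:    # first line of a changeset block
            node, date = line.split("\t", 1)
            header = (node, date)
        elif line not in latest and (wanted is None or line in wanted):
            latest[line] = header
    return latest


def hg_latest_per_file(repo=".", wanted=None):
    """Run one hg log over the ancestors of '.' and build the table."""
    out = subprocess.run(
        ["hg", "log", "-r", "reverse(::.)", "-T", TEMPLATE],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return latest_per_file(out, wanted)
```

Files that were removed at some point still show up in the result, with the changeset that last touched them; intersect the keys with the output of hg manifest if only current files are wanted.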

A somewhat more efficient / more natural solution could be:
create a template or style for the desired log output (I can't predict which will work better for you)
create an alias for hg log -l 1 --template ... or hg log -l 1 --style ...
EDIT
Much later, a more correct solution (from recent discoveries) using hg grep:
hg grep "." "set:**.py" --files-with-matches -d -q -T"{files % '{file} {date|age}\n'}"
Part of the output in a test repo:
hggit/__init__.py 7 weeks ago
hggit/git_handler.py 7 weeks ago
hggit/gitdirstate.py 7 weeks ago
…
You have to modify the fileset to get results for only part of your tree (for all branches) and, maybe, the template to fit your needs.
I don't have a fileset for selecting "files in branch X" right now; I think it would be something using the revs() predicate:
"revs(revs, pattern)"
Evaluate set in the specified revisions. If the
revset match multiple revs, this will return file matching pattern in
any of the revision.
Some unpublished predicates can be used to define the revset (according to the examples; see # "set:revs('wdir()'..." for referencing the working directory), and I can't work out the correct form for a branch predicate.

Git log with JSON hierarchy

The problem is that I have this git command:
git log --pretty=format:'%n{%n%d%n "CommitHash": "%H",%n "Author": "%an",%n "AuthorEmail": "%ae",%n "Date": "%ad",%n "Message": "%f"%n}'
With it, I get a log in a JSON-like format, but I need the branches as parents and the commit names as children, and each commit name must in turn be the parent of its respective info (author, date, email, etc.).
The log output should be something like this:
[
  "Branch or Merge Name":"The Branch or Merge Name"[
    "Commit Name":"The Commit Name"{
      The commit info......
    }
  ]
]

I doubt this would be easy to do without a script, considering a commit can be part of multiple branches.
That means for any commit of your list, there is not "one father", but possibly multiple ones.
Reversing the model, and having for each commit, as a child, the list of branches each commit is part of, would make more sense.
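The reversed model can be sketched as follows, assuming the commits reachable from each branch have already been collected (for example with git rev-list <branch>, which is not run here). The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def branches_per_commit(commits_by_branch):
    """Invert {branch: [commit, ...]} into {commit: [branch, ...]}."""
    result = defaultdict(list)
    for branch, commits in commits_by_branch.items():
        for commit in commits:
            result[commit].append(branch)
    # Sort the branch names so the output is stable.
    return {commit: sorted(names) for commit, names in result.items()}


# Hypothetical data: c1 is reachable from both branches.
example = {"master": ["c1", "c2"], "feature": ["c1", "c3"]}
inverted = branches_per_commit(example)
```

Here c1 ends up with two branches, which is exactly the case a "branch as parent" hierarchy cannot represent without duplicating the commit.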

Look at this tool: http://bit-booster.com/graph.html. It asks you to pass in the log produced by git log --pretty='%h|%p|%d'. I'm trying to do the same thing with Apache ECharts.
--pretty="%H,%P,%D"
%H expands to show the commitId.
%P expands to show the parent commitIds.
%D expands to show the decorations (tags and branches).
But there are a few subtle problems with it:
%P will expand to all of %H's parent commits (separated by spaces), and so you'll need to run the output through a second script to normalize it into a format suitable for D3.
%P might expand to 3 or more commits (very rare). These are called octopus merges!
%D expands to a comma-separated list of decorations (branch and tag labels), and there's no limit on how many branches and tags a single commit might have.
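The normalization step mentioned above might look like the sketch below. It assumes the log was produced with git log --pretty='%H|%P|%D', using a pipe rather than a comma between fields because %D itself contains commas; the function name and the node/edge output shape are assumptions, not something any charting library prescribes.

```python
def parse_graph_lines(text):
    """Parse lines of the form '<hash>|<parents>|<decorations>'.

    Returns (nodes, edges): nodes maps each hash to its list of decorations,
    edges is a list of (child, parent) pairs, one per parent.
    """
    nodes, edges = {}, []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice so that any '|' in a decoration is preserved.
        commit, parents, decorations = line.split("|", 2)
        nodes[commit] = [d.strip() for d in decorations.split(",") if d.strip()]
        # One edge per parent: merges yield two, octopus merges three or more.
        edges.extend((commit, parent) for parent in parents.split())
    return nodes, edges
```

With one edge per (child, parent) pair, ordinary commits, merges, and octopus merges all take the same shape, and the number of decorations per commit is unbounded.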

How to update to a branch by name when there's a hash collision?

So my coworker just won the hash lottery. We create a feature branch for every ticket we resolve, following the convention b##### where ##### is the issue number.
The trouble is that when he tried updating to that branch (before it existed) via hg up branch(b29477), it took him to default instead of saying that the branch doesn't exist.
It turns out that branch(b29477) actually selects the branch of whatever the thing inside the parens resolves to (instead of forcing Mercurial to evaluate the thing inside the parens as a branch name, as I thought!), and there happened to be a changeset on default whose hash begins with b29477. So instead of saying the branch didn't exist, it took him to the tip of default!
Now we can work around this problem by choosing a different branch name, but I want to know if there's any way to hg update <branch_name_and_dont_interpret_this_as_anything_else>?
BTW, hg log also lies about what its --branch parameter does. It says:
-b --branch BRANCH [+] show changesets within the given named branch
But that's not true at all. Go ahead and run it with a hash. e.g.,
hg log --branch eea844fb
And it will turn up results. If you dig through the docs, you'll discover that it's actually the same as:
hg log -r 'branch(eea844fb)'

Try this:
hg update -r "branch('literal:b29477')"
From the Mercurial help page:
branch(string or set)
All changesets belonging to the given branch or
the branches of the given changesets.
If string starts with re:, the remainder of the name is treated as a
regular expression. To match a branch that actually starts with re:,
use the prefix literal:.
This means that if you use the literal prefix, you are specifying a string. And a string is not a set.
As the text says, if you specify a changeset, Mercurial will show:
the branches of the given changesets

ISO Mercurial "attributes" - tags that apply to more than one changeset, applied after the changeset

BRIEF
How do I tag multiple changesets in Mercurial with the same tag? Possibly on the same branch, possibly on different branches.
E.g. something like a tag that says whether the full QA test ran.
I can create multiple instances of a tag by editing the file, but the hg tools nearly always ignore all but the first.
DETAIL
I am looking for what I call changeset "attributes" - a concept that I have used in other CVS and DVCS, but which I cannot seem to find in Mercurial.
Basically, an attribute is very much like a tag, but where a tag is only supposed to refer to a single changeset, an attribute may apply to multiple changesets.
Q: does anyone know how to do this?
Similarly: is there a way to attach a description to a changeset after the changeset has been created? Note that I do not want to rewrite history: I do not want to delete or change or replace the original checkin message. I just want to add some more stuff - and have that more stuff appear in queries like hg log. E.g. "I forgot to add a file to commit df..a3 - look instead to commit 8f..77 where I checked in the missing files."
EXCRUCIATING DETAIL
I know - you can do hg tag -f to force a tag to apply to more than one changeset. But so many other hg tag related features really only work with a single changeset per tag. Or at least only one changeset per line of descent - i.e. per head.
So you can leave a tag defined forever and ever. I like placing the date or other context in such a tag - e.g. tests-pass-2012-01-14.
Or you can have a "floating tag", that moves upwards - e.g. "most recent rev where all the slow tests pass", which I might call simply "tests-pass".
(By the way, you may apply such attributes or tags after the checkin - especially if you have a slow QA process, perhaps a quick smoke test, followed by a slower full set of tests that may take a week to complete. So you checkin, and then, later, go back and apply the attribute, the uniqified dated tag. And you may later need to go back and modify such a tag, e.g. if more tests are added, so that a changeset that used to pass all tests no longer does. E.g. all-tests-pass-2012-01-14 and all-tests-pass-2012-01-15 may apply to the same changeset.)
But it is onerous to have to uniqify such fixed tags. Hence what I call an attribute: a tag that applies to multiple changesets, and which is version controlled. So you might apply all-tests-pass to rev 105, and then later to 106 and 107. But then you realize that new tests fail on 106, so you reapply.
Then the attribute history might look like
105:
tagged all-tests-pass on 2012-01-14-10h00 (in changeset XXX)
tagged all-tests-pass on 2012-01-15-10h00 (in changeset YYY)
106:
tagged all-tests-pass on 2012-01-14-13h00 (in changeset XXX)
tagged not-all-tests-pass on 2012-01-15-13h00 (in changeset YYY)
107:
tagged all-tests-pass on 2012-01-14-14h00 (in changeset XXX)
tagged all-tests-pass on 2012-01-15-10h00 (in changeset YYY)
and a revset query like
105::107 and current_attribute_tag(all-tests-pass)
= returns 105 and 107 on the latest, at or after YYY is in the repo
= but returns 105, 106, 107 if cloned so as not to include YYY
while
105::107 and attribute_tag_at_any_time(all-tests-pass)
=returns 105 106 and 107 at any time if the repo holds XXX
===
I would like to be able to do things like
run hg bisect, but only on changesets tagged tests-pass.
exclude certain log messages from hg log and glog
etc.
===
By the way, I reject phases and bookmarks for this purpose because they are not version controlled. And I want these attribute tags to be VCed, so that I can follow something like the ebb and flow of all-tests-pass, as mentioned above.
Branches are almost what I want, because Mercurial branches are really changeset attributes, not branches. But I don't think that the branch associated with a changeset can be changed after the commit.
(I really wish that you could switch changesets to a branch after you commit them. I call this wished for feature "retroactive branching".)
===
Here's a classic example of why I might want attributes: have you ever forgotten to add a file to the VCS? And then had a changeset that fails to build? And added the file in a subsequent changeset? ...
I would like to be able to retroactively mark a changeset as will-not-build--missing-files. And then have bisect not even bother looking at such changesets.

Have a look at this: Custom revision properties in Mercurial?
There's no native support for attributes. You could write an extension (there's a dictionary of extra properties that get saved with a changeset).
Or you could hack it together with multiple changesets per tag (as you suggested).
Or you could hack it together with a new all-tests-pass branch, and have your CI server merge to that branch when all tests pass (then bisects are between tip of default and tip of all-tests-pass).
But the short answer remains that there's no existing native way to do it.

Script to adjust history in an RCS/CVS ,v file

In preparation for a migration to Mercurial, I would like to make some systematic changes to many thousands of ,v files. (I'll be editing copies of the originals, I hasten to add.)
Examples of the sorts of changes I'm after:
For each revision whose message begins with some text that indicates a known username (e.g. [Fred Bloggs]), if the username in the comment matches the Author in the ,v file, then delete the unnecessary username text from the commit message
If the ,v contains a useful description, append it to the commit message for revision 1.1 (cvs2hg ignores the description - but lots of our CVS files actually came from RCS, where it was easy to put the initial commit message into the description field by mistake)
For edits made from certain shared user accounts, adjust the author, depending on the contents of the commit message.
Things I've considered:
Running 'cvs log' on each individual ,v file - parsing the output, and using rcs -m to change this history. Problems with this include:
there doesn't seem to be a way to pass a text file to rcs -m - so if the revision message contained single and/or double quotes, or spanned multiple lines, it would be quite a challenge quoting it correctly in the script
I can't see an rcs or cvs facility to change the author name associated with a revision
less importantly, it would be likely to start a huge number of processes - which I think could get slow
Writing Python to parse the ,v file, and adjust the contents. Problems with this include:
we have a mixture of line-endings in our ,v files - including some binary files that should have been text, and vice-versa - so great care would be needed to not corrupt the files
care would be needed for quoting of the @ character (the string delimiter in ,v files) in any commit messages, if it fell on the start of the line in a multi-line comment
care would also be needed on revisions where the last line of the committed file was changed, and doesn't have a newline - meaning that the ,v has a @ at the very end of a line, instead of being preceded by \n
Clone the version of cvs2hg that we are using, and try to adjust its code to make the desired edits in-place
Are there any other approaches that would be less work, or any existing code that implements this kind of functionality?

Your first approach may be the best one. I know that in Perl, handling quotation marks and multiple lines wouldn't be a problem. For example:
my $revision = ...;
my $log_message = ...;
system('rcs', "-m$revision:$log_message", $filename);
where $log_message can contain any arbitrary text. Since the string doesn't go through the shell, newlines and other metacharacters won't be reinterpreted. I'm sure you can do the same thing in Python.
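A rough Python equivalent of the Perl call above, as a sketch: passing the arguments as a list to subprocess.run likewise bypasses the shell, so quotes and newlines in the message reach rcs unchanged. The function names are hypothetical; the -m<rev>:<msg> form is the same one used in the Perl example.

```python
import subprocess


def rcs_message_command(filename, revision, log_message):
    """Build the argv list for `rcs -m<rev>:<msg> <file>`."""
    return ["rcs", f"-m{revision}:{log_message}", filename]


def set_log_message(filename, revision, log_message):
    """Replace the log message of one revision, bypassing the shell."""
    subprocess.run(rcs_message_command(filename, revision, log_message), check=True)
```

Keeping the argv construction in its own function makes it easy to log or inspect exactly what would be run before touching thousands of files.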
(As for your second approach, I wouldn't expect line endings to be a problem. If you have Unix-style \n endings and Windows-style \r\n endings, you can just treat the trailing \r as part of the line, and everything should stay consistent. I'm making some assumptions here about the layout of ,v files.)

I wrote a Python library, EditRCS (available on PyPI), that implements the RCS format so the user can load an RCS file as a tree of Python objects, modify it programmatically and save to a new RCS file.
You can apply a function to every revision using mapDeltas(), for example to change an author's name; or walk the tree using getNext() for something more complicated such as joining two file histories together.
