Later this week we are doing some tests on an embedded device. For this, we want to know exactly what software was tested, but we also have to make some modifications to it for testing purposes. The current strategy we use is based on the popular "A successful Git branching model". What we want is
To avoid bloating the current structure.
To make it visually easy to distinguish what was tested from what has changed since.
We are leaning towards an unnamed branch with a tag at its head as it keeps the named branches to releases and features, but is still easy to find visually.
What would be the drawbacks of this approach and what other approaches are suitable?
The drawback is similar to what a named branch offers: it's a permanent marker (though you can re-tag, thus move a tag).
However, the better alternative in this case is to use a bookmark: by design it is a transient name that can be added to, moved to, or removed from any revision you like at any time.
Say you have two “main-line” branches that have been developed separately for a long time. When you come to do the merge between them, you wish to split the work over all your developers.
E.g. you wish your C# programmer to merge the C# code, while your TSQL programmer is merging the stored procs.
I want all developers to be able to see what still needs to be merged and the results of each other's merges. Can Mercurial help with this?
[I am assuming there is more than one developer left after they have been told that they will have to do the merge!]
As I have been asked in the comments, this is how I think we got into this state…
If I understand the history from before I joined correctly, one large customer came along and said “we will pay you a lot of money if you add x, y, and z to your product, but we are not willing to take the risk of you giving us any changes that we don’t directly benefit from” (they even put some of their own programmers on the team to check they were not getting other changes).
In the few years that the work was going on for this large customer, other customers said they would not buy the product if we did not add a, b, and c.
The large customer was not willing to pay for x, y, and z to be done in a way (or to be able to be turned off) that did not break the product for other customers that used it in different ways, and most of the programmers who understood the system were being sold to the large customer, so there was no manpower (until now) to fix x, y, and z so they could be given to all our customers.
(Basically trying to build a product based on consulting income - the fact that we were bought by a company that is closely related to the large customer in the meantime just makes the politics more complex.)
At the time all this started, the product did not have much automated test coverage. The code base goes back to when .NET v1 shipped, and all the features are well integrated in both the UI and the source code.
Hopefully history will not repeat itself, but it is very hard for programmers to say NO in a city that does not have many software companies. We are now moving to Mercurial, and I wish to know how we could cope with Mercurial if history did repeat itself! (I am also starting to question whether Mercurial is all it is made out to be compared to Perforce.)
Yes, there is a workflow for that scenario.
Plan ahead, know before you branch which branch will merge into which later on, and make sure you merge into it regularly.
Example: You have the default branch, and create a feature branch that at some point should merge back into default. Periodically, merge from default into your feature branch to ensure it is up to date with all changes in default. This creates smaller conflicts along the way. At the end, you do one last merge into your feature branch, fix any conflicts, and then merge it into default.
This avoids "The Big Merge" at the end and creates conflicts that are easier to manage.
One sensible approach could be to use hg convert to split the repository into smaller parts (C# part, TSQL part, etc.), perform the merges on those smaller, more fine-grained repositories, and (if keeping the original repo is important) commit the results back onto the original repository once they are ready.
Probably the closest to what you want to achieve is to merge one change-set at a time. This avoids the big bang merge by doing lots of little merges. If you have a clear separation between your C# devs and TSQL devs (which it sounds like you do), then you should be able to allocate each change-set to the right group.
The big negative with this approach is that you won't be able to speed up the process by doing change-sets in parallel.
The big positive with this approach is that you can test after each merge (or a small set of merges), making it easier to track down the changes that break your application. If you don't currently have a strong automated test suite, I would strongly recommend beefing it up first.
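As a rough illustration of allocating change-sets to the right group, here is a minimal sketch. It assumes you have already extracted, for each change-set, the list of files it touches (for example from the log); the `OWNERS` mapping, the group names, and the sample revision numbers are all hypothetical.

```python
import os
from collections import defaultdict

# Hypothetical mapping from file extension to the group that should merge it.
OWNERS = {".cs": "csharp", ".sql": "tsql"}


def owner_of(files):
    """Return the single group owning a change-set, 'mixed' if it spans
    several groups, or 'other' if no file has a known extension."""
    groups = {OWNERS.get(os.path.splitext(path)[1].lower(), "other")
              for path in files}
    if not groups:
        return "other"
    if len(groups) == 1:
        return groups.pop()
    return "mixed"


def allocate(changesets):
    """Group change-set ids by owner, keeping the original (merge) order."""
    queues = defaultdict(list)
    for rev, files in changesets:
        queues[owner_of(files)].append(rev)
    return dict(queues)


# Hypothetical sample data: (revision, files touched).
changesets = [
    (101, ["src/Billing.cs"]),
    (102, ["db/procs/GetInvoice.sql"]),
    (103, ["src/Orders.cs", "db/procs/GetOrder.sql"]),
    (104, ["src/Orders.cs", "src/Billing.cs"]),
]
print(allocate(changesets))
```

Change-sets that land in the "mixed" queue are the ones where the two groups would need to sit together; the rest can be handed out by group while still being merged in order.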
Very soon we are going to start on an open-source (py+qt) project which is supposed to be multi-platform (we're using FreeBSD as our native platform), and we're not sure which DVCS/hosting to use.
In the past we used darcs for a very long time, but moved away from it because no adequate public hosting was available. We played for some time with Monotone - nicely designed, but mostly niche today. Fossil is nice, but it uses a non-standard wiki, and its tracker is functional, but we expect more.
Considering that we won't be working on a kernel-sized project, we do not need Git, which we consider too complex to deal with, especially for potential contributors who might use Windows and prefer GUI tools.
So, the story comes to Bazaar/Launchpad and Mercurial/Bitbucket...
Here are some pro/cons which we gathered together, but would like to hear if we missed something which might help us to decide...
Bazaar pro/cons:
2.4 is probably quick enough for our needs,
simple to use,
has nice GUI tool (explorer),
handles empty directories,
(probably) less popular than Mercurial,
does not have equivalent of hg's named branches
The last point is probably not too important because there are nicks and there is the colo-branches plugin, so one can get the same or similar functionality.
The most problematic quirk we find in Bazaar is its revision-numbering scheme and the problem that can arise if one pushes from a feature branch into upstream, which would change the revision numbers.
Maybe it's a lesser problem when using Launchpad...
As far as Launchpad is concerned:
- it has very nice bug tracker with email interface
- it's (maybe) more project-oriented than Bitbucket
- no private repos as with Bitbucket
- no wiki for projects - the bug (https://bugs.launchpad.net/launchpad/+bug/240067) is more than 3 years old and still marked 'Low priority'. LP is the only one among {LP, Sourceforge, Bitbucket, Google, Github} which lacks this feature, and that really degrades an otherwise nice hosting solution.
What we've found in the other camp...
Mercurial is:
(probably) more popular than Bazaar,
quick,
simple to use,
there is the nice TortoiseHg for users who are not CLI-savvy,
we like named branches,
some quirks like handling empty directories (or https://www.mercurial-scm.org/bts/issue29)
However, what we like most over Bazaar is what we believe to be great merging capabilities, without the hassle of changed revision numbers, thanks to the revno:hash scheme.
As for Bitbucket:
we like to have unlimited/private repos
we like having wikis available for the project(s)
we miss email interface for the tracker and the tracker is (maybe) not on par with the one at LP (reviews etc.)
Finally, there are some projects we are interested in which are hosted on GitHub, so we would like to use a single DVCS which can help us interoperate with Git/GitHub projects.
We find the bzr-git plugin very capable, but we have no experience with hg-git.
There is a bzr-hg plugin (not as mature as bzr-git), but we do not know of anything like hg-bzr apart from hg's convert extension, which handles conversion from bzr to hg.
Is there any important feature we have missed that would have important consequences for deciding between the two?
Finally, we use DVCS for all our needs (simple project, writings...) and we'd prefer to settle on one DVCS/hosting which can serve all our purposes and be useful in contributing to git(hub) projects as well.
What do you recommend?
In Bazaar:
You can avoid the problem of revision numbers being renumbered by setting append_revisions_only in branch.conf, which will make sure people only merge into trunk, rather than switching the trunk around.
I like bzr-colo a lot for dealing with named colocated branches.
I would certainly like to see Launchpad get wikis. It's assigned and in progress at the moment so perhaps it will get done soon.
Update: Seeing this comment makes it easier for us to abandon bzr/LP and embrace hg/bitbucket.
Yesterday when I checked out the latest version of our internal tool I saw 30+ new revisions. This got me curious, since I thought that somebody had finally fixed those annoying bugs and added that feature I had been waiting for so long... and guess what? None of this happened; someone just thought it would be nice to update some headers and make minor adjustments to two or three functions. Everything in a separate commit. Great.
This raised a discussion in our team - should this be considered OK, or should we prohibit such "abuse"? Arguably this really could fit in one or two commits, but 30 seems too much. How should this be handled - what is the best practice?
You should be committing any time you make a change and are about to move on to the next one.
You shouldn't commit anything that stops the project from building.
You should be filling in the commit message so people know what changes have been made.
That'll do for me. I don't assume something has been done unless I see it in the commit message.
Generally I think a commit should relate to one logical task, e.g. fixing bug #103 or adding a new print function. This could be one file or several, that way you can see all changes made for a particular task. It is also easier to roll back the change if necessary.
If each file is checked in one by one, it is not easy to see the changes made for a particular update / task.
Also if multiple tasks are completed in one commit, it is not easy to see what changes belong to which task.
I wouldn't care about the number of commits as each commit keeps project consistency (build will still succeed). This is some internal count that shouldn't bother you. If you want to change something here, better tell people to use some structured commit messages (like "[bugfix]...", "[feature]...", "[minorfix]").
By the way, if you want to know whether bugs have been fixed or features have been added, using a bug tracking system is much better than checking commits in an SVN-like tool.
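To make the structured commit-message idea concrete, here is a minimal sketch of how such prefixes could be parsed and tallied. It assumes the "[kind] summary" convention suggested above; the function names and the "untagged" bucket are illustrative, not part of any tool.

```python
import re
from collections import Counter

# Matches a leading "[kind]" tag on the first line of a commit message.
PREFIX_RE = re.compile(r"^\[(\w+)\]\s*(.*)$")


def parse_message(message):
    """Split '[kind] summary' into (kind, summary).

    kind is None when the first line carries no prefix."""
    lines = message.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = PREFIX_RE.match(first_line)
    if match is None:
        return None, first_line
    return match.group(1).lower(), match.group(2)


def summarize(messages):
    """Count commits per kind; messages without a prefix count as 'untagged'."""
    return Counter(parse_message(m)[0] or "untagged" for m in messages)


# Hypothetical commit messages.
messages = [
    "[minorfix] Update file headers",
    "[minorfix] Tidy argument checks in loader",
    "[bugfix] Handle empty input\n\nLonger description here.",
    "Update some more headers",
]
print(summarize(messages))
```

With a tally like this, 30 small commits read as "28 minor fixes and 2 bug fixes" at a glance, which addresses the surprise without restricting how often people commit.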
The battle against code entropy is an ongoing team effort. Minor checkins where one just 'fixes broken windows' along one's way should be encouraged, not frowned upon. The source repository is the wrong tool for keeping track of bugfixes - that's what a bug tracker is for - so the inconvenience in locating fixes when scanning the code repository and not the bug repository seems utterly negligible to me.
I work in a moderate size team on a large code base (~1M LOC) with a huge history (~20Y). A lot of the code is a pile of mess - rotten branch logic, deprecated API, naming conventions, even random indentation often makes it a misery to read. I started a habit of minor "drive-by" readability improvements, to try and fight complete code rot, and am trying hard to get teammates to adopt the same habit.
Unless your circumstances are radically different, I would try to look favorably on any such initiative. The alternative (which I'm familiar with all too well) is fearful stagnation, which dooms any code to rot.
Context: I work at a small software company that has traditionally done research-type work, and does not have much experience in the commercial space. We are now trying to push into the commercial world. Due to our origins in research we are used to a very rapid development cycle and very little structure in terms of maintaining proper versions of projects.
Problem: The lack of structure is now proving to be somewhat of a hindrance, as every developer has a slightly different view of the code base. A problem one developer discovers is not reproducible by another developer, and problems found in one build may disappear in the next (or worse, new problems may appear). This makes for a very frustrating experience for someone who is responsible for integrating all the projects and ensuring quality and performance standards are met - i.e. myself.
Potential solution: Personally I am convinced we need to enforce better structure via fixed version numbers and regular releases. It should be self-evident how proper versioning would help with many of our problems, but of course it is not without problems - developers need to do extra work to perform and test releases, and will no longer be able to use the latest versions of everything.
Question: To come to a point - what sorts of strategies do you recommend for ensuring the process and effort required for releases occurs as smoothly as possible? We are using git for version control, maven for our build system, and we have bug tracking and continuous integration systems running, so I believe the tools are there. I am simply unsure about what a proper release process should look like.
You have the big three in place: version control, one-click build via Maven and your continuous build server, and bug tracking. It sounds like you guys are gravitating towards Agile methodologies, and so you ought to be trying to keep the trunk version of your product in a near deliverable state at all times.
When you decide to make your first release, create a branch off of your trunk version for that release. Decide on a labelling scheme and be sure to label the branch version. For example, your first release could be 1.0.4530, where the 1 means first version, the 0 means it's the first release candidate, and the 4530 is the version control change number. You test this release branch and fix important bugs on it. After a while you issue another release candidate, say 1.1.4807. This process iterates a couple more times (say), your release becomes good enough, and you ship version 1.3.5167.
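The labelling scheme described above can be sketched as a small value type. This is only an illustration under assumptions: the class and field names are made up, and the third field is taken to be any number that only ever increases (a change number as in the example, or a CI build number if your version control does not provide one).

```python
from typing import NamedTuple


class ReleaseLabel(NamedTuple):
    """A label of the form <version>.<candidate>.<change>, e.g. 1.0.4530."""
    version: int    # product version (1 = first version)
    candidate: int  # release-candidate counter, starting at 0
    change: int     # increasing change/build number (assumed)

    @classmethod
    def parse(cls, text):
        parts = text.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected three dot-separated fields: {text!r}")
        return cls(*(int(p) for p in parts))

    def __str__(self):
        return f"{self.version}.{self.candidate}.{self.change}"

    def next_candidate(self, change):
        """Label for the next release candidate on the same release branch."""
        if change <= self.change:
            raise ValueError("change number must increase between candidates")
        return ReleaseLabel(self.version, self.candidate + 1, change)


first = ReleaseLabel.parse("1.0.4530")
second = first.next_candidate(4807)
print(first, second)
```

Because the label is a tuple of integers, labels compare and sort in release order without any extra code, which is convenient when listing builds.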
Meanwhile, your new development occurs only in the trunk version, and from time to time you'll need to merge bug fixes from the 1.x release branch back to the trunk. Later, you'll split off a 2.x branch from the trunk to repeat the process for your second release. You'll generally have several active branches (plus the trunk), with development limited to the trunk and each branch kept pristine and independent from development.
You guys will get the hang of things and your developer coordination problems will become less frequent. But these problems are nearly all going to be limited to the trunk, not the release branches.
A problem one developer discovers is not reproducible by another developer, and problems found in one build may disappear in the next (or worse, new problems may appear). This makes for a very frustrating experience for someone who is responsible for integrating all the projects and ensuring quality and performance standards are met - i.e. myself.
Potential solution: Personally I am convinced we need to enforce better structure via fixed version numbers and regular releases.
I don't think you need to have very frequent releases just to coordinate internally. You can do that through version control. Just have people talk about specific git revisions when reporting issues. Also note that you will have to coordinate any external dependencies/libraries too. Some kind of vendor branches could help with this.
It sounds like the developers need to use "test branches" and respect the "stable/production branch" a little bit more.
Sell the concept of "do your wild west stuff in this branch", and when you are happy with the results, merge it into the "boring stable production branch"...
(or something like that)
There are books written about the general topic; Amazon search even returns three titles for specialized "version control with git."
I think you will benefit from defining a canonical view of the code base. Call it Test. A problem is a problem if it appears in Test. If a problem does not appear in some developer's view, it is up to that developer to figure out what is the important difference; and likewise for a problem that appears in a developer's view, but not in Test.
One convention is for Test to be re-built from sources on a nightly basis. A more strenuous convention is for Test to be re-built upon every update. If your team is small (five or fewer) and not dispersed over great distances or multiple timezones, a reasonable first approximation is to make Test a git workspace on a server upon which your toolchain has been installed along with some cron jobs so that this workspace is updated and rebuilt every night (usually).
I'm currently trying to evaluate Mercurial, to get a feel for the philosophy the system tries to promote - but one thing that's got me confused is the presence of the bundled 'extensions' and how they fit into the mix.
In the core package, Mercurial ships with a bunch of functionality that is implemented as extensions but is disabled by default. (See: https://www.mercurial-scm.org/wiki/UsingExtensions#Extensions_Bundled_with_Mercurial)
Here's the thing I'm confused about:
Are these extensions considered first class citizens by the Mercurial dev team and therefore part of the overall Mercurial approach to DVCS?
Why are they implemented outside of the default features and disabled by default?
I don't need info on how to activate extensions; that's pretty straightforward. It's the logic behind the separation that I'm curious about.
The reason I'm trying to get my head around this is because I don't really want to try and crowbar an opposing approach into Mercurial via extensions if it differs from the overall philosophy of the project.
Are these extensions considered first class citizens by the Mercurial dev team and therefore part of the overall Mercurial approach to DVCS?
Yes. Although we won't generally advocate their use to new users, they are very useful for advanced usage. I guess everybody in the dev team has some extensions enabled (at least mq, patchbomb, and sometimes record).
Extensions accepted into hgext/ are reviewed prior to inclusion, and we generally require them to provide tests. But they are often owned by outside contributors and aren't updated by the dev team except for API changes within core hg.
Why are they implemented outside of the default features and disabled by default?
We generally think that hg should stay simple and adding more commands might confuse users (e.g. if you have a simple workflow you don't need to learn about mq). But if a command is deemed useful for most users, it can migrate from an extension into core (that was the case for bisect, and it is the case of the subrepo functionality).
Almost immediately after posting, I learnt about the following hg command:
hg help extensions
This contained some information that I don't think is available in the Mercurial help docs:
Extensions are not loaded by default for a variety of reasons: they can increase startup overhead; they may be meant for advanced usage only; they may provide potentially dangerous abilities (such as letting you destroy or modify history); they might not be ready for prime time; or they may alter some usual behaviors of stock Mercurial. It is thus up to the user to activate extensions as needed.
This helps answer part of my question.