New RTC stream from existing baseline / Truncate history - rational-team-concert

We're trying to migrate from RTC to Git, but with tons of change sets the migration takes too long. While we can use a flavor of RTC2GIT with a baseline starting point, I also want to try a migration from a new RTC stream that has been created without all the old history prior to a given baseline. We would choose our first baseline, which is a few years old (several thousand change sets), as opposed to the 10+ years of change set history (100k+) that we have. It's a simplification because some activities are too slow in the tools.
To start with, how can we create a new RTC Stream from an existing stream's baseline without the prior changesets? I see that after selecting a component, I can use Replace with... and choose a baseline. That resets my workspace component to that baseline. So what is the most efficient way to create a new stream from that reset Workspace but without all the changesets (such that if we list the history there is none)?
After that starting point in the new stream, hopefully I could use the original stream as the flow target in my workspace to accept every changeset from the baseline forward to the current date. Then the new stream would have all changesets from the baseline to now, which is the desired goal. For reference see: Is there a way to create a RTC snapshot or baseline based on a past date?
Ideally it would be nice to use a starting date even older than our first baseline (few years old), to create the new stream but I don't know if that's possible?
The other way of asking this would be: is there a way to take a "stream" and truncate its changeset history (everything older than a certain point)? But if the code is "composed" of the changeset deltas, that may preclude it.
Update 12/10/2018: Received a suggestion for removing the content of old change sets to keep the database small, which may help performance. The change set still exists as history information, but the content can no longer be recovered:
https://jazz.net/library/article/1006

Related

Log File Viewer in HTML

I am about to implement an HTML-based log file viewer. The update volume varies from 1 to 10 updates per second.
The server is WebSocket based and will be developed by me as well - I have built a Fleck based prototype and this side looks fine.
Is there any other smart HTML field besides a plain text field which I could use for updating?
Would you recommend that I collect updates and work with a fixed update interval?
I guess it would be more efficient to add the update interval in the server then, right?
I am new to JavaScript and HTML5, so please do not be too harsh if these questions are trivial.
I am about to build a similar application and I therefore played around a little bit, comparing the performance of 1.) attaching DOM elements for every log row, 2.) attaching a table row for every log row, and 3.) using a textarea tag:
http://jsfiddle.net/PBzg5/18/
While removing all rows from the viewer is fastest with the textarea it takes longest to fill it. Also, there seems to be no faster method than manual string concatenation for textarea. Attaching elements to DOM (i.e. one text element and one <br> element per log row) is definitely fastest, with the table-based version being close behind. Also, using DOM elements will allow you to do more advanced things like coloring individual words than when using textareas. However, I haven't tested the performance influence of this yet.
When you implement your viewer, keep in mind that browsers will break down pretty fast when you try to display an unlimited number of rows. Therefore, keep only a certain number of the newest rows in a buffer (like terminals usually do) and display only those.
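The bounded buffer described above is independent of the UI technology. Here is a minimal sketch in Python, assuming a hypothetical `LogBuffer` class and an arbitrary row limit; the same logic could run on the server or be ported to the browser script.

```python
from collections import deque


class LogBuffer:
    """Keeps only the newest max_rows log rows, like a terminal scrollback."""

    def __init__(self, max_rows: int = 1000) -> None:
        # A deque with maxlen discards the oldest item automatically when full.
        self._rows = deque(maxlen=max_rows)

    def append(self, row: str) -> None:
        self._rows.append(row)

    def visible_rows(self) -> list:
        """Return the retained rows, oldest first."""
        return list(self._rows)


buffer = LogBuffer(max_rows=3)
for i in range(5):
    buffer.append(f"line {i}")
print(buffer.visible_rows())  # ['line 2', 'line 3', 'line 4']
```

The viewer then renders only what `visible_rows()` returns, so the number of displayed rows stays constant no matter how long the log runs.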

Is there a workflow in Mercurial that enables many people to work on a merge (integration) together?

Say you have two “main-line” branches that have been developed separately for a long time. When you come to do the merge between them, you wish to split the work among all your developers.
E.g. you wish your C# programmer to merge the C# code, while your TSQL programmer is merging the stored procs.
I want all developers to be able to see what still needs to be merged and the results of each other's merges. Can Mercurial help with this?
[I am assuming there is more than one developer left after they have been told that they will have to do the merge!]
As I have been asked in the comments, this is how I think we got into this state…
If I understand the history correctly from before I joined, one large customer came along and said “we will pay you a lot of money if you add x, y, and z to your product, but we are not willing to take the risk of you giving us any changes that we don’t directly benefit from” (they even put some of their own programmers on the team to check they were not getting other changes).
In the few years that the work was going on for this large customer, other customers said they would not buy the product if we did not add “a, b, and c”.
The large customer was not willing to pay for x, y, and z to be done in a way (or to be able to be turned off) that did not break the product for other customers that used it in different ways, and most of the programmers that understood the system were being sold to the large customer, so there was no manpower (until now) to fix x, y, and z so they could be given to all our customers.
(Basically trying to build a product based on consulting income. The fact that we were bought by a company that is closely related to the large customer in the meantime just makes the politics more complex.)
At the time all this started the product did not have much automated test coverage, the code base goes back to when .net v1 shipped and all the features are well integrated both in the UI and in the source code.
Hopefully history will not repeat itself, but it is very hard for programmers to say NO in a city that does not have many software companies. We are now moving to Mercurial, and I wish to know how we could cope with Mercurial if history did repeat itself! (I am also starting to question whether Mercurial is all it is made out to be compared to Perforce.)
Yes, there is a workflow for that scenario.
Plan ahead, know before you branch which branch will merge into which later on, and make sure you merge into it regularly.
Example: You have the default branch, and create a feature branch that at some point should merge back into default. Periodically, merge from default into your feature branch to ensure it is up to date with all changes in default. This creates smaller conflicts along the way. At the end, you do one last merge into your feature branch, fix any conflicts, and then merge it into default.
This avoids "The Big Merge" at the end and creates conflicts that are easier to manage.
One sensible approach could be to use hg convert to split the repository into smaller parts (C# part, TSQL part etc), perform merges on those smaller and more fine-grained repositories, and (if keeping original repo is of importance) just commit the results back onto original repository once they are ready.
Probably the closest to what you want to achieve is to merge one change-set at a time. This avoids the big bang merge by doing lots of little merges. If you have a clear separation between your C# devs and TSQL devs (which it sounds like you do), then you should be able to allocate each change-set to the right group.
The big negative with this approach is that you won't be able to speed up the process by doing change-sets in parallel.
The big positive with this approach is that you can test after each merge (or a small set of merges), making it easier to track down the changes that break your application. If you don't currently have a strong automated test suite, I would strongly recommend beefing it up first.

How to define the version number of a software?

What is the best method to determine the version number I should use for a software or component? Is there a general rule to set version numbers?
I'm pretty sure it is a basic question but I didn't find anything useful after searching a while.
Microsoft has a convention of:
[major].[minor].[build].[revision]
Or follow Jeff's versioning system.
I've been doing this as an interim until I find a better solution. I don't build many large applications, mostly reports and smaller macros, but it's still important for me to keep track of changes and versions.
[Current year].[Current month].[Current day]
FileName 9.7.17.rpt for example.
It works for me and my boss, and it gives a value which you can compare to today's date to see how old the file is. I also keep a changelog.txt file in the same folder as the most current version, and it keeps track of all the changes from the previous versions. I also keep track of all versions in a version control page on each project's tab in OneNote.
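A minimal sketch of the date-based scheme, assuming the leading "9" in the example stands for 2009 (so the year is written without its century) and that no field is zero-padded. The function names are hypothetical.

```python
from datetime import date


def date_version(day: date) -> str:
    """Format a date as year.month.day with a short year, e.g. 9.7.17."""
    return f"{day.year % 100}.{day.month}.{day.day}"


def parse_date_version(version: str, century: int = 2000) -> date:
    """Turn a version such as '9.7.17' back into a date (century is assumed)."""
    year, month, day = (int(part) for part in version.split("."))
    return date(century + year, month, day)


def age_in_days(version: str, today: date) -> int:
    """How old a file with this version is, relative to 'today'."""
    return (today - parse_date_version(version)).days


print(date_version(date(2009, 7, 17)))          # 9.7.17
print(age_in_days("9.7.17", date(2009, 7, 29)))  # 12
```

Because the fields are not zero-padded, comparing the raw strings does not sort versions correctly ("9.10.1" sorts before "9.7.17" as text), so the sketch parses them back into dates before comparing.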
Thanks for the answer. I'll also throw in how I store the projects for giggles.
Every project gets its own folder. Inside that folder I'll have 4 main items that help me keep track of what's going on in the project.
An old versions folder
A folder for any reference material I might need for the project
The actual project file
And the changelog
That tree will look something like this.
Project X
    Old versions
        X Report 9.4.12.rpt
        X Report 9.5.3.rpt
        X Report 9.7.20.rpt
    Reference
        SQL calls.txt
        Client list.txt
        Procedures.doc
    X Report 9.7.29.rpt
    X Report changelog.txt
This way of keeping track of my work really cuts down on the amount of time that I need to spend documenting anything and organizes it in a standard way so if my boss needs to grab something I've worked on, even he knows exactly what everything means and where it is.
For storing multiple projects in my network folder I have these folders.
Inbox
Projects
    #Archived Projects
    Current Project 1
    Current Project 2
    Current Project 3
Reference
Inbox is where I toss random things to process later, or a folder where my boss can throw something I'm going to need for a later project. The Projects folder contains all the projects I'm currently working on, and then when I'm done or they no longer become a current priority, they get tossed in #Archived Projects. Reference is a folder for general job reference material, like policies and procedures, phone lists, org charts, fire escape plans. I may never use them, but it's comforting to have a place to put that kind of stuff as opposed to digging through old email.
This is a very common question. Are you sure you searched around? Wikipedia has a good article on software versioning.
Or, you can follow Ubuntu's convention of using year and month.
For example, a release in April 2009 would be:
v9.04
Do it like Donald Knuth does with TeX---its version converges to π with each release and will in fact become π when he dies.
Since version 3, TeX has used an idiosyncratic version numbering system, where updates have been indicated by adding an extra digit at the end of the decimal, so that the version number asymptotically approaches π. This is a reflection of the fact that TeX is now very stable, and only minor updates are anticipated. The current version of TeX is 3.1415926; it was last updated in March 2008.
from Wikipedia
A common scheme seems to be [major].[minor].[revision]. The major version number increments on large feature changes or rewrites (or stays 0 as long as you haven't reached a stable version, although many open source projects never get past 0 here). The minor version number increases on minor changes, such as a collection of bugfixes, an added small feature and the like. The revision increments with each build and reflects the smallest granularity of tracking your exact version. Things like small fixes usually get rolled into this.
Usually the first number is for major changes or major releases, the second number is used when minor features and bug fixes are added, and the third number is used for minor bug fixes and revision numbers.
Ex. 1.0.0
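A minimal sketch of handling [major].[minor].[revision] numbers in code. The helper names are hypothetical, and resetting the lower fields to zero when a higher field is bumped is a common convention assumed here, not something the answers above prescribe.

```python
def parse_version(text: str) -> tuple:
    """Split '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def bump(version: tuple, part: str) -> tuple:
    """Return the next version after a major, minor, or revision change."""
    major, minor, revision = version
    if part == "major":
        return (major + 1, 0, 0)      # assumed convention: reset lower fields
    if part == "minor":
        return (major, minor + 1, 0)  # assumed convention: reset revision
    if part == "revision":
        return (major, minor, revision + 1)
    raise ValueError(f"unknown version part: {part}")


# Tuples compare field by field, which plain strings do not:
print(parse_version("1.10.0") > parse_version("1.9.3"))  # True
print("1.10.0" > "1.9.3")                                # False
print(bump((1, 4, 7), "minor"))                          # (1, 5, 0)
```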
Depends on a lot of things.
If you are doing .Net work, you can have the system keep track of version numbers for your .dlls and .exe files automatically.
We frequently use the subversion revision as part of our version number. We use a system like:
major.minor.svn-version
We increment the major/minor manually based on internal decisions, and have the svn-version propagate to distinguish builds.
The most important thing is that version numbers make sense to your users.

What can I do to prevent write-write conflicts on a wiki-style website?

On a wiki-style website, what can I do to prevent or mitigate write-write conflicts while still allowing the site to run quickly and keeping the site easy to use?
The problem I foresee is this:
User A begins editing a file
User B begins editing the file
User A finishes editing the file
User B finishes editing the file, accidentally overwriting all of User A's edits
Here were some approaches I came up with:
Have some sort of check-out / check-in / locking system (although I don't know how to prevent people from keeping a file checked out "too long", and I don't want users to be frustrated by not being allowed to make an edit)
Have some sort of diff system that shows any other changes made when a user commits their changes and allows some sort of merge (but I'm worried this will be hard to create and would make the site "too hard" to use)
Notify users of concurrent edits while they are making their changes (some sort of AJAX?)
Any other ways to go at this? Any examples of sites that implement this well?
Remember the version number (or ID) of the last change. Then read the entry before writing it and compare if this version is still the same.
In case of a conflict inform the user who was trying to write the entry which was changed in the meantime. Support him with a diff.
Most wikis do it this way. MediaWiki, Usemod, etc.
Three-way merging: The first thing to point out is that most concurrent edits, particularly on longer documents, are to different sections of the text. As a result, by noting which revision Users A and B acquired, we can do a three-way merge, as detailed by Bill Ritcher of Guiffy Software. A three-way merge can identify where the edits have been made from the original, and unless they clash it can silently merge both edits into a new article. Ideally, at this point carry out the merge and show User B the new document so that she can choose to further revise it.
Collision resolution:
This leaves you with the scenario when both editors have edited the same section. In this case, merge everything else and offer the text of the three versions to User B - that is, include the original - with either User A's version in the textbox or User B's. That choice depends on whether you think the default should be to accept the latest (the user just clicks Save to retain their version) or force the editor to edit twice to get their changes in (they have to re-apply their changes to editor A's version of the section).
Using three-way merging like this avoids lock-outs, which are very difficult to handle well on the web (how long do you let them have the lock?), and the aggravating 'you might want to look again' scenario, which only works well for forum-style responses. It also retains the post-respond style of the web.
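To make the three-way rule concrete, here is a minimal sketch that merges at the granularity of whole sections rather than lines. It assumes each revision is a dict mapping a section name to its text; the function name and the data are hypothetical, and a real wiki would more likely run a line-based diff3-style merge.

```python
def three_way_merge(base: dict, theirs: dict, mine: dict):
    """Merge two edits of 'base'. Returns (merged, conflicts).

    A section changed by only one side takes that side's text. A section
    changed differently by both sides is reported as a conflict together
    with all three versions, so the second user can resolve it by hand.
    A missing key means the section does not exist in that revision.
    """
    merged = {}
    conflicts = {}
    for key in sorted(set(base) | set(theirs) | set(mine)):
        original, a, b = base.get(key), theirs.get(key), mine.get(key)
        if a == b:             # both sides agree (or neither changed it)
            result = a
        elif a == original:    # only 'mine' changed this section
            result = b
        elif b == original:    # only 'theirs' changed this section
            result = a
        else:                  # both changed it differently
            conflicts[key] = (original, a, b)
            continue
        if result is not None:  # None means the section was deleted
            merged[key] = result
    return merged, conflicts


base = {"intro": "Hello", "body": "Old body", "end": "Bye"}
user_a = {"intro": "Hello there", "body": "Old body", "end": "Bye"}
user_b = {"intro": "Hello", "body": "New body", "end": "Goodbye"}

merged, conflicts = three_way_merge(base, user_a, user_b)
print(merged)     # {'body': 'New body', 'end': 'Goodbye', 'intro': 'Hello there'}
print(conflicts)  # {}
```

Keeping the original text in each conflict entry matches the suggestion above to offer User B all three versions of a clashing section.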
If you want to Ajax it up a bit, dynamically 3-way merge User A's version into User B's version while they are editing it, and notify them. Now that would be impressive.
In Mediawiki, the server accepts the first change, and then when the second edit is saved a conflicts page comes up, and then the second person merges the two changes together. See Wikipedia: Help:Edit Conflicts
Using a locking mechanism will probably be the easiest to implement. Each article could have a lock field associated with it and a lock time. If the lock time exceeded some set value, you'd consider the lock to be invalid and remove it when checking out the article for edit. You could also keep track of open locks and remove them on session close. You'd also need to implement some concurrency control in the database (autogenerated timestamps, perhaps) so that you could make sure that you are checking in an update to the version that you checked out, just in case two people were able to edit the article at the same time. Only the one with the correct version would be able to successfully check in an edit.
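A minimal sketch of the expiring lock, kept in memory for illustration only; a real site would store the lock owner and lock time with the article in the database, as described above. The class name, the default timeout, and the injectable clock are assumptions.

```python
import time


class ArticleLocks:
    """Edit locks that become invalid after timeout_seconds."""

    def __init__(self, timeout_seconds: float = 900.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._locks = {}     # article_id -> (owner, acquired_at)

    def acquire(self, article_id: int, user: str) -> bool:
        """Check out an article; fails while another user holds a live lock."""
        now = self._clock()
        holder = self._locks.get(article_id)
        if holder is not None:
            owner, acquired_at = holder
            if owner != user and now - acquired_at <= self._timeout:
                return False
        # No lock, an expired lock, or the same user re-acquiring.
        self._locks[article_id] = (user, now)
        return True

    def release(self, article_id: int, user: str) -> None:
        """Check in: drop the lock if this user holds it."""
        holder = self._locks.get(article_id)
        if holder is not None and holder[0] == user:
            del self._locks[article_id]


fake_now = [0.0]
locks = ArticleLocks(timeout_seconds=60, clock=lambda: fake_now[0])
print(locks.acquire(1, "alice"))  # True
print(locks.acquire(1, "bob"))    # False, alice holds a live lock
fake_now[0] = 61.0
print(locks.acquire(1, "bob"))    # True, alice's lock has expired
```

An expired lock only stops blocking other editors; it does not stop the original holder from submitting later, which is why the version check on check-in is still needed.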
You might also be able to find a difference engine that you could just use to construct differences, though displaying them in a wiki editor may be problematic -- actually displaying the differences is probably harder than constructing the diff. You'd rely on the versioning system to detect when you needed to reject an edit and perform a diff.
In Gmail, if we are writing a reply to a mail and someone else sends a reply while we are still typing it, a popup appears indicating that there is a new update and the update itself appears as another post without a page reload. This approach would suit your needs and if you can use Ajax to show the exact post with a link to diff of what was just updated while User B is still busy typing his entry that would be great.
As Ravi (and others) have said, you could use an AJAX approach and inform the user when another change is in progress. When an edit is submitted, just indicate the textual differences and let the second user work out how to merge the two versions.
However, I'd like to add on with something new you could try in addition to that: Open a chat dialog between the editors while they're doing their edits. You could use something like embedded Gabbly for that, for instance.
The best conflict resolution is direct dialog, I say.
Your problem (lost update) is solved best using Optimistic Concurrency Control.
One implementation is to add a version column to each editable entity of the system. On user edit you load the row and display the HTML form to the user. A hidden field gives the version, let's say 3. The update query needs to look something like:
update articles set ..., version=4 where id=14 and version=3;
If the number of rows affected is 0, then someone has already updated article 14. All you need to do then is decide how to deal with the situation. Some common solutions:
last commit wins
first commit wins
merge conflicting updates
let the user decide
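The version check above can be tried end to end with an in-memory SQLite database. This is a minimal sketch: the `articles` table, id 14, and version 3 come from the example query, while the `body` column and the `save_article` helper are assumptions made for illustration.

```python
import sqlite3


def save_article(conn, article_id: int, new_body: str, expected_version: int) -> bool:
    """Write the edit only if nobody changed the row since it was loaded."""
    cursor = conn.execute(
        "UPDATE articles SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, article_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the version no longer matches: a lost-update conflict.
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO articles VALUES (14, 'original', 3)")

# Users A and B both loaded version 3; A saves first.
print(save_article(conn, 14, "edit by user A", 3))  # True
print(save_article(conn, 14, "edit by user B", 3))  # False, version is now 4
```

When the function returns False, the caller applies whichever policy from the list was chosen, for example reloading the row and showing the user a diff.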
Instead of an incrementing version int/long you can use a timestamp but it's not suggested because:
retrieving the current time from the JVM isn't necessarily safe in a clustered environment, where nodes may not be time synchronized.
(quote from Java Persistence with Hibernate)
Some more info at the hibernate documentation.
At my office, we have a policy that all data tables contain 4 fields:
CreatedBy
CreatedDate
LastUpdateBy
LastUpdateDate
That way there is a nice audit trail on who has done what to the records, at least most recently.
But most importantly, it becomes easy enough to compare the LastUpdateDate of the current or edited record on the screen (which requires you to store it on the page, in a cookie, or wherever) with the value in the database. If the values don't match, you can decide what to do from there.

Should I "retire" the old trunk of a newly-rewritten project?

Recently I've been revisiting an old project, which I last worked on about two years ago. Obviously, during this time I've learned new habits about how best to program, and I've got the itch to keep the tests, scrap the implementation, and re-implement the entire project. It's not a large project, and I believe I'll not be losing much by re-writing it.
However, I don't know what to do about the version history. It's likely that when I'm done updating it, the new version will share only 3-4% of its code with the old version. Furthermore, the changes tend to be so wide-reaching that trying to maintain clean changesets is an exercise in frustration and futility. Given this, it seems unnecessary to force potential developers to download the old irrelevant versions.
One option I've been considering is to move the trunk to a branch, something like old-trunk/, and begin development in an empty branch. I don't know if this is a good idea, and I'm concerned that having two trunks could lead to confusion. Which brings me to the question:
What does SO think? If you encountered a project that had "reset" its trunk, would you be confused by it?
Why not just label/tag the trunk with "OldVersion" and continue development in the same location? This way you avoid the double branch altogether and still maintain ways to get to the old version of your code. Unless you're developing a different product, you likely want to keep the same trunk.
I tag the $trunk, so there is a "copy". Tag it with the last version released, or the date before you started back on it, etc. I actually tag on the date released, and name my tags after the date itself. (Use a label in place of a tag if that is what your system supports.)
After it is tagged, delete/rename/overhaul as needed. The version history is there, just in case. And, you have a complete copy labelled/tagged for archiving purposes.
Assuming you're using SVN, there's really nothing extra to do. Just remember the revision from which you started refactoring and then continue working on the trunk. It's better than moving to an empty branch, since the history of the changes from the old code to the new one will be recorded.
However, if you plan to write the thing from scratch and just copy some little bits of old code, maybe you should think about starting it in a new repository.
I don't see the need to throw the old version away. Just make all changes and check in. New developers will not have to download any old code, they will always just check out the new version.
No, it wouldn't confuse me because the project documentation would show that before I even got into it. We ALL have crap that needs to be redone, and frankly, should be.
Tell everyone it's insecure and not HIPAA compliant and they'll let you re-code it ;o)
And then start posting the fodder onto The Daily WTF for the rest of us 80))