Do Mercurial bundle files support internal integrity checks? - mercurial

I am working on a project with developers around the globe and we are using mercurial for our source control solution. Currently, we communicate our change sets by creating bundles and posting to a mailing list. A disagreement has arisen about best practices, and we have not been able to find an answer in the mercurial documentation.
When creating a bundle, is there any sort of internal integrity check that occurs? Or should we be sending digests along with the change set to ensure file integrity?

A bundle contains exactly the same data as the data transferred by the wire protocol. Due to the way mercurial works, there's a recursive hashing scheme going on, so every revision must be uncorrupted in order to be used.

Related

How to manage JSON-Schemas for multiple projects?

Suppose you have a Schema that is used in a UI-App (e.g. Vue), a Node.js or Springboot Server and has to validate against Databases (e.g. SQL, mongoDB,...), and maybe some Micro-services running on whatever.
How and where do I manage a this JSON-Schema, so that if I have to change the schema for whatever Reason, that every architectural component can handle the new JSON-Schema(s).
Otherwise I need to update the Schema in up to 10 projects so none is incompatible.
Is it really as simple as having a git project full with just JSON-Schemas or do I need specific loaders for each language/environment?
Are there best practices that I am unaware of?
PS: I don't really think I need the automatically synchronized on runtime, so don't really think I need another Microservice to achieve that.
That being said, if a Microservice is the best way to go, then getting a Microservice it is.
If you keep them in a git project, how do you load them? Clone the project each time the app starts? It may work, but I would go with a more flexible approach that should take too much effort to be done:
Build a JSON schema repository accessible via a REST API
When the app starts, it makes a request to grab the schema (latest, or a specific version)
That way you get an uniform (and scalable) way of playing with the schemas. Even if you think about a hot-reload sometime in the future, you can leverage this approach to do that.
Here is an old project in this direction, you may give it a shot to see if it works (or for some inspiration, at least)

What is "vendoring"?

What is "vendoring" exactly? How would you define this term?
Does it mean the same thing in different programming languages? Conceptually speaking, not looking at the exact implementation.
Based on this answer
Defined here for Go as:
Vendoring is the act of making your own copy of the 3rd party packages
your project is using. Those copies are traditionally placed inside
each project and then saved in the project repository.
The context of this answer is in the Go language, but the concept still applies.
If your app depends on certain third-party code to be available you could declare a dependency and let your build system install the dependency for you.
If however the source of the third-party code is not very stable you could "vendor" that code. You take the third-party code and add it to your application in a more or less isolated way. If you take this isolation seriously you should "release" this code internally to your organization/working environment.
Another reason for vendoring is if you want to use certain third-party code but you want to change it a little bit (a fork in other words). You can copy the code, change it, release it internally and then let your build system install this piece of code.
Vendoring means putting a dependency into you project folder (vs. depending on it globally) AND committing it to the repo.
For example, running cp /usr/local/bin/node ~/yourproject/vendor/node & committing it to the repo would "vendor" the Node.js binary – all devs on the project would use this exact version. This is not commonly done for node itself but e.g. Yarn 2 ("Berry") is used like this (and only like this; they don't even install the binary globally).
The committing act is important. As an example, node_modules are already installed in your project but only committing them makes them "vendored". Almost nobody does that for node_modules but e.g. PnP + Zero Installs of Yarn 2 are actually built around vendoring – you commit .yarn/cache with many ZIP files into the repo.
"Vendoring" inherently brings tradeoffs between repo size (longer clone times, more data transferred, local storage requirements etc.) and reliability / reproducibility of installs.
Summarizing other, (too?) long answers:
Vendoring is hard-coding the often forked version of a dependency.
This typically involves static linking or some other copy but it doesn't have to.
Right or wrong, the term "hard-coding" has an old and bad reputation. So you won't find it near projects openly vendoring, however I can't think of a more accurate term.
As far as I know the term comes from Ruby on Rails.
It describes a convention to keep a snapshot of the full set of dependencies in source control, in directories that contain package name and version number.
The earliest occurrence of vendor as a verb I found is the vendor everything post on err the blog (2007, a bit before the author co-founded GitHub). That post explains the motivation and how to add dependencies. As far as I understand the code and commands, there was no special tool support for calling the directory vendor at that time (patches and code snippets were floating around).
The err blog post links to earlier ones with the same convention, like this fairly minimal way to add vendor subdirectories to the Rails import path (2006).
Earlier articles referenced from the err blog, like this one (2005), seemed to use the lib directory, which didn't make the distinction between own code and untouched snapshots of dependencies.
The goal of vendoring is more reproducibility, better deployment, the kind of things people currently use containers for; as well as better transparency through source control.
Other languages seem to have picked up the concept as is; one related concept is lockfiles, which define the same set of dependencies in a more compact form, involving hashes and remote package repositories. Lockfiles can be used to recreate the vendor directory and detect any alterations. The lockfile concept may have come from the Ruby gems community, but don't quote me on that.
The solution we’ve come up with is to throw every Ruby dependency in vendor. Everything. Savvy? Everyone is always on the same page: we don’t have to worry about who has what version of which gem. (we know) We don’t have to worry about getting everyone to update a gem. (we just do it once) We don’t have to worry about breaking the build with our libraries. […]
The goal here is simple: always get everyone, especially your production environment, on the same page. You don’t want to guess at which gems everyone does and does not have. Right.
There’s another point lurking subtlety in the background: once all your gems are under version control, you can (probably) get your app up and running at any point of its existence without fuss. You can also see, quite easily, which versions of what gems you were using when. A real history.

Mercurial and NTFS Alternate data stream

How does Mercurial handle Alternate Data Streams (in the NTFS file system)? If it can't handle that, is there a DCVS that does?
EDIT: When I change version with update, what happens to the ADS ? Is it lost (erased)? Is it versioned too? Is it alltogether ignored?
Mercurial does not store alternate data stream. Additionally, they are likely to be overwritten on update.
I don't think any of the open source VCS I know handle that kind of thing (even permissions are usually not handled).

Mercurial command-line "API" reference?

I'm working on a Mercurial GUI client that interacts with hg.exe through the command line (the preferred high-level API, as I understand it).
However, I am having trouble determining the possible outputs of each command. I can see several outputs by simulating situations, but I was wondering if there is a complete reference of the possible outputs for each command.
For instance, for the command hg fetch, some possible outputs are:
pulling from https://User#server.com/Repo
searching for changes
no changes found
if there are no changes, or:
abort: outstanding uncommitted changes
or one of several other messages, depending on the situation.
I would like to structure my program to handle as many of these cases as possible, but it's hard for me to know in advance what they all are.
Is there a documented reference for the command-line? I have not been able to find one with The Google.
Look through the translation strings file. Then you'll know you have every message handled and be able to see what parts of it vary.
Also, fetch is just a convenience wrapper around pull/update/merge. If you're invoking mercurial programmatically you probably want to keep those three very different concepts separate in your running it so you know which part failed. In your example above it's the 'update' failing, so the 'pull' would have succeeded and the 'update's failing would allow you to provide the user with a better message.
(fetch is an abomination, which is part of why it's disabled by default)
Is this what you were looking for: https://www.mercurial-scm.org/wiki/MercurialBook ?
Mercurial 1.9 brings a command server, a stable (in a sense that API doesn't change that much) and low overhead (there is no need to run hg process for every command). The communication is done via a pipe.

Best practices for version information?

I am currently working on automating/improving the release process for packaging my shop's entire product. Currently the product is a combination of:
Java server-side codebase
XML configuration and application files
Shell and batch scripts for administrators
Statically served HTML pages
and some other stuff, but that's most of it
All or most of which have various versioning information contained in them, used for varying purposes. Part of the release packaging process involves doing a lot of finding, grep'ing and sed'ing (in scripts) to update the information. This glue that packages the product seems to have been cobbled together in an organic, just-in-time manner, and is pretty horrible to maintain. For example, some Java methods create Date objects for the time of release, the arguments for which are updated by a textual replacement, without compiler validation... just, urgh.
I'm trying avoid giving examples of actual software used (i.e. CVS, SVN, ant, etc.) because I'd like to avoid the "use xyz's feature to do this" and concentrate more on general practices. I'd like to blame shoddy design for the problem, but if I had to start again, still using varying technologies, I'd be unsure how best to go about handling this, beyond laying down conventions.
My questions is, are there any best practices or hints and tips for maintaining and updating versioning information across different technologies, filetypes, platforms and version control systems?
Create a properties file that contains the version number and have all of the different components reference the properties file
Java files can reference the properties through
XML can use includes?
HTML can use a JavaScript to write the version number from the properties in the HTML
Shell scripts can read in the file
Indeed, to complete Craig Angus's answer, the rule of thumb here should be to not include any meta-informations in your normal delivery files, but to report those meta-data (version number, release date, and so on) into one special file -- included in the release --.
That helps when you use one VCS (Version Control System) tool from the development to homologation to pre-production.
That means whenever you load a workspace (either for developing, or for testing or for preparing a release into production), it is the versionning tool which gives you all the details.
When you prepare a delivery (a set of packaged files), you should ask that VCS tool about every meta-information you want to keep, and write them in a special file itself included into the said set of files.
That delivery should be packaged in an external directory (outside any workspace) and:
copied to a shared directory (or a maven repository) if it is a non-official release (but just a quick packaging for helping the team next door who is waiting for your delivery). That way you can make 10 or 20 delivers a day, it does not matter: they are easily disposable.
imported into the VCS in order to serve as official deliveries, and in order to be deployed easily since all you need is to ask the versionning tool for the right version of the right deliver, and you can begin to deploy it.
Note: I just described a release management process mostly used for many inter-dependant projects. For one small single project, you can skip the import in the VCS tool and store your deliveries elsewhere.
In addition to Craig Angus' ones include the version of tools used.