What is "vendoring"? - language-agnostic

What is "vendoring" exactly? How would you define this term?
Does it mean the same thing in different programming languages? Conceptually speaking, not looking at the exact implementation.

Based on this answer
Defined here for Go as:
Vendoring is the act of making your own copy of the 3rd party packages
your project is using. Those copies are traditionally placed inside
each project and then saved in the project repository.
The context of this answer is in the Go language, but the concept still applies.
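In modern Go, for instance, the toolchain supports this directly. A minimal sketch (go mod vendor is part of the standard module tooling):

    go mod vendor      # copy all dependencies into ./vendor
    git add vendor
    git commit -m "vendor third-party packages"

Since Go 1.14, builds automatically use ./vendor when it exists; older toolchains need go build -mod=vendor.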

If your app depends on certain third-party code being available, you can declare a dependency and let your build system install it for you.
If, however, the source of the third-party code is not very stable, you can "vendor" that code: you take the third-party code and add it to your application in a more or less isolated way. If you take this isolation seriously, you should "release" this code internally to your organization/working environment.
Another reason for vendoring is wanting to use certain third-party code but change it a little (a fork, in other words). You can copy the code, change it, release it internally, and then let your build system install that piece of code.

Vendoring means putting a dependency into your project folder (vs. depending on it globally) AND committing it to the repo.
For example, running cp /usr/local/bin/node ~/yourproject/vendor/node and committing it to the repo would "vendor" the Node.js binary: all devs on the project would use this exact version. This is not commonly done for node itself, but e.g. Yarn 2 ("Berry") is used like this (and only like this; they don't even install the binary globally).
The act of committing is important. node_modules, for example, are already installed in your project, but only committing them makes them "vendored". Almost nobody does that for node_modules, but e.g. PnP + Zero-Installs in Yarn 2 is actually built around vendoring: you commit .yarn/cache, with its many ZIP files, into the repo.
"Vendoring" inherently trades repo size (longer clone times, more data transferred, local storage requirements, etc.) against reliability and reproducibility of installs.

Summarizing other, (too?) long answers:
Vendoring is hard-coding an often-forked version of a dependency.
This typically involves static linking or some other form of copying, but it doesn't have to.
Right or wrong, the term "hard-coding" has an old and bad reputation, so you won't find it near projects that openly vendor; however, I can't think of a more accurate term.

As far as I know the term comes from Ruby on Rails.
It describes a convention to keep a snapshot of the full set of dependencies in source control, in directories that contain package name and version number.
The earliest occurrence of vendor as a verb I found is the vendor everything post on err the blog (2007, a bit before the author co-founded GitHub). That post explains the motivation and how to add dependencies. As far as I understand the code and commands, there was no special tool support for calling the directory vendor at that time (patches and code snippets were floating around).
The err blog post links to earlier ones with the same convention, like this fairly minimal way to add vendor subdirectories to the Rails import path (2006).
Earlier articles referenced from the err blog, like this one (2005), seemed to use the lib directory, which didn't make the distinction between own code and untouched snapshots of dependencies.
The goal of vendoring is reproducibility and better deployment (the kind of things people currently use containers for), as well as better transparency through source control.
Other languages seem to have picked up the concept as is; one related concept is lockfiles, which define the same set of dependencies in a more compact form, involving hashes and remote package repositories. Lockfiles can be used to recreate the vendor directory and detect any alterations. The lockfile concept may have come from the Ruby gems community, but don't quote me on that.
The solution we’ve come up with is to throw every Ruby dependency in vendor. Everything. Savvy? Everyone is always on the same page: we don’t have to worry about who has what version of which gem. (we know) We don’t have to worry about getting everyone to update a gem. (we just do it once) We don’t have to worry about breaking the build with our libraries. […]
The goal here is simple: always get everyone, especially your production environment, on the same page. You don’t want to guess at which gems everyone does and does not have. Right.
There’s another point lurking subtly in the background: once all your gems are under version control, you can (probably) get your app up and running at any point of its existence without fuss. You can also see, quite easily, which versions of what gems you were using when. A real history.
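The same idea later became a first-class Bundler workflow. A minimal sketch, assuming a project with a Gemfile:

    bundle cache             # copy every gem the app needs into vendor/cache
    git add vendor/cache Gemfile.lock
    bundle install --local   # later installs resolve from vendor/cache, offline

(bundle cache was called bundle package in older Bundler versions.)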

Related

NativeScript, Code Sharing and different environments

Note: this is not a dupe of this or this other question. Read on: this question is specific to the Code-Sharing template.
I am doing some pretty basic experiments with NativeScript, Angular and the code-sharing templates (see: @nativescript/schematics).
Now I am doing some exploration / PoC work on how different "build configurations" are supported by the framework. To be clear, I am searching for a simple (and hopefully official) way to have the application use a different version of a specific file (let's call it configuration.ts) based on the current platform (web/ios/android) and environment (development/production/staging?).
Doing the first part is obviously trivial - after all, that is the prime purpose of the code-sharing schematics. So, different versions of the same file are identified by different extensions. This page explains things pretty simply.
What I don't get as easily is whether the framework/template supports any similar convention-based rule that can be used to switch between debug/release (or, even better, development/staging/production) versions of a file. Think, for example, of a config.ts file that contains different parameters based on the environment.
I have done some research on the topic, but I was unable to find a conclusive answer:
the old and now-retired documentation for the AppBuilder platform mentions a (.debug. and .release.) naming convention for files. I don't think this works anymore.
other sources mention passing parameters during the call to tns build / tns run and then fetching them via the webpack env variable... See here. This may work, but seems oddly convoluted
a third option that gets mentioned is to use hooks to customize the build (or to use a plugin that should do the same)
lastly, for some odd reason, @nativescript/schematics seems to generate a default project that contains two files called environment.ts and environment.prod.ts. I suspect those only work for the web version of the project (read: ng serve) - I wasn't able to get the mobile compiler to recognize files that end with .debug.ts, .prod.ts or .release.ts
While it may be possible that what I am trying to do just isn't supported (yet?), the general confusion and dissenting opinions on the matter make me think I may be missing something... somewhere.
In case this IS somehow supported, I also wonder how it may integrate with the NativeScript Sidekick app that is often suggested as a tool to ease the build/run process of NativeScript applications (there is no way to specify additional parameters for the tns commands that Sidekick automates; the only option available is switching between debug and release mode), but this is probably better left for another question.
Environment files are not yet supported; passing environment variables from the build command could be a viable solution for now.
But of course, you may write your own schematics if you'd like immediate support for environment files.
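A minimal sketch of the environment-variable approach (the variable name and values here are made up): pass a flag on the tns command line and pick it up in webpack.config.js, which tns invokes with the env object:

    tns run android --bundle --env.environment=staging

    // webpack.config.js (excerpt)
    module.exports = env => {
        const environment = (env && env.environment) || "development";
        // use this to pick a file, e.g. alias config.ts to config.staging.ts,
        // or inject the value with webpack.DefinePlugin
        console.log("Building for environment: " + environment);
        // ...rest of the generated webpack configuration
    };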
I did not look into sharing environment files between web and mobile yet - I do like Manoj's suggestion regarding modifying the schematics, but I'll have to cross that bridge when I get there, I guess. I might have an answer to your second question regarding Sidekick. The latest version does support a "Webpack" build option, which seems to pass the --bundle parameter to tns. The caveat is that this option seems to be more sensitive to TypeScript errors, even relatively benign ones, so you have to be careful and make sure to fix them all prior to building. In my case I had to lock the version of @types/jasmine in package.json to "2.8.6" in order to avoid an incompatibility between it and the version of TypeScript that Sidekick's cloud build uses. Another hint is to check "Clean Build" after npm dependency changes are made. Good luck!

Composer dependency with minor differences in composer.json file

I'm creating an application that is going to work on two separate versions of the same software. These versions will have completely different modules, with a shared dependency on a JavaScript framework I've built.
Let's say
dave/version1
dave/version2
Which both share the dependency via a require of
dave/framework
I want to maintain this framework (dave/framework) in one repository that both of the parent modules can require. However, the locations where these framework files need to be placed differ slightly between the two modules, along with slightly different requirements for the composer.json files to ensure everything gets moved around correctly (the two versions of the software implement Composer rather differently).
With my limited knowledge of composer and Git I've formulated a couple of solutions:
Create three repositories: two wrapper repositories with specific composer.json files to support each version of the software, each depending on a third repository which contains the actual framework. I'm unsure if this would ever work outside of theory. It also ends up being a little messy.
Use some form of clever tagging and have version1 and version2 depend on separate versions of the framework, which would in turn have a slightly different makeup. Composer would then struggle with pulling the latest version of the module, as we'd be running two ever-so-slightly different code bases in weird versions.
However, both of these seem potentially messy and the incorrect way of structuring what I'm trying to achieve.
Is there a nice way to achieve this? Or am I better off maintaining two separate repositories for the framework?
With some correct planning, your first option is going to be easier than you think.
Git has a system called Submodules built right in.
The Version1 and Version2 repos would include the Framework repo within them. Then, when you update Framework, you can pull those updates into Version1/2 with no problem. You control when and how, so you can make sure you don't introduce a bug down the road either.
Here is some documentation for submodules: https://git-scm.com/book/en/v2/Git-Tools-Submodules
https://git-scm.com/docs/git-submodule
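A minimal sketch of that setup (the URL and paths are made up):

    cd version1
    git submodule add https://github.com/dave/framework.git framework
    git commit -m "add dave/framework as a submodule"

    # in a fresh clone of version1:
    git submodule update --init

    # to pull the latest framework changes when you decide to:
    git submodule update --remote framework
    git commit -am "bump framework"

Each wrapper repo records the exact framework commit it uses, which is what gives you the control over updates described above.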

how to configure Apache + SVN webDAV directory listing

I have a Subversion server running with Apache mod_dav_svn and it works nicely, but the browsing ability via HTML is a bit spartan. Is there a way to customize it at all?
There are two things I'd like to do that would make a huge difference:
separate the directories from the files so all the directories are at the top. Right now everything is in alphabetical order. (The listing I'm currently looking at happens to have all the directories preceding the files in alphabetical order, but trust me, that's not the normal case.)
list the basic file statistics (file size, modification time, last updated revision, etc.)
Is it possible to do either of these with mod_dav_svn?
In a vanilla Subversion install, the web interface is very spartan by design. (Remember the HTTP interface is designed for SVN clients, not human beings.)
You can customize the display somewhat via the SVNIndexXSLT directive. (Here is a good place to start).
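A minimal sketch of the relevant httpd configuration (the paths are assumptions; Subversion's source distribution includes a sample svnindex.xsl to start from):

    <Location /svn>
      DAV svn
      SVNParentPath /var/svn
      # render the XML directory index through a custom XSLT stylesheet
      SVNIndexXSLT "/svnindex.xsl"
    </Location>

Note that the stylesheet is applied by the browser, so /svnindex.xsl must be served by Apache as an ordinary resource outside the SVN location.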
If you want something richer (with logs and diff features), you will need to install a special front end. WebSVN and ViewVC are very popular. There is also Trac, but this is a higher-level tool.
A list of other repo browsing tools.
Just FYI, we use WebSVN for our repo instance. It took some effort to get it up and running, but once it is set up you can pretty much leave it alone.
WebSVN looks like it might help you. I tried Trac, and while it is very slick, I found it complicated and overkill for what you're looking for, imo.
Not out of the box - that is, not without modifying the source code. You might be interested in tools like ViewSVN or the more sophisticated Trac or Redmine.

Mercurial: How to manage common/shared code

I'm using Mercurial for personal use and am contemplating it for some distributed projects, as an alternative to SVN, for various reasons.
I'm getting comfortable with using it for self-contained projects and can see various options for sharing; however, I haven't yet found any guidance on managing common libraries to be included in multiple projects, in a manner similar to that provided by externals in Subversion.
The most obvious shared lump of code is error handling and reporting - we want this to be pretty much the same in all projects (it's fairly well evolved). There is also utility code, control libraries and the like that we find better to have as projects built with each solution than to pull in as compiled classes (not least because it ensures they are kept up to date; continuous integration helps us address breaking changes).
Thoughts? (I hate open-ended questions, but I want to know what, if anything, others are doing.)
Mercurial 1.3 now includes nested repository support (subrepositories), which can be used to express dependencies. The other option is to let your build system handle the download and tracking of dependencies, using something like Ivy or Maven, though those are more focused on pulling down compiled code.
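A minimal sketch of the subrepository mechanism (the path and URL are hypothetical): clone the shared library into place inside the parent repo, then declare the mapping in a .hgsub file at the repository root and commit:

    # .hgsub - maps a local path to the shared repository's source
    libs/errorhandling = https://hg.example.com/errorhandling

Committing .hgsub (with the clone present at libs/errorhandling) makes Mercurial record the subrepo's exact changeset in .hgsubstate, so every checkout of the parent pins the same revision of the shared code.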
The world has changed since I asked that question and the solution I now use is different.
The simple answer is now to use packages (specifically NuGet as I do .NET) to deliver the common code instead of nesting repos and including the projects in a solution.
So I have common code built into NuGet packages by, and hosted using, TeamCity; where previously I would have had an external and included the project/source, I now just reference the package.
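On the consuming side, that looks roughly like this (the package name and feed URL are made up):

    nuget sources Add -Name teamcity -Source https://teamcity.example.com/guestAuth/app/nuget/v1/FeedService.svc/
    nuget install Dave.Common -Version 1.2.3

Each project then upgrades the shared code on its own schedule by bumping the package version, instead of tracking a nested repo.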
Use the Forest extension; it emulates svn externals for Hg, to some extent that is.
Subrepositories (with a good guide) or Guestrepo, "to overcome ... limitations" (of subrepos), are today's language-agnostic answer.

Best practices for version information?

I am currently working on automating/improving the release process for packaging my shop's entire product. Currently the product is a combination of:
Java server-side codebase
XML configuration and application files
Shell and batch scripts for administrators
Statically served HTML pages
and some other stuff, but that's most of it
All or most of which have various versioning information contained in them, used for varying purposes. Part of the release packaging process involves doing a lot of finding, grep'ing and sed'ing (in scripts) to update the information. This glue that packages the product seems to have been cobbled together in an organic, just-in-time manner, and is pretty horrible to maintain. For example, some Java methods create Date objects for the time of release, the arguments for which are updated by a textual replacement, without compiler validation... just, urgh.
I'm trying to avoid giving examples of actual software used (i.e. CVS, SVN, Ant, etc.) because I'd like to avoid "use xyz's feature to do this" answers and concentrate more on general practices. I'd like to blame shoddy design for the problem, but if I had to start again, still using varying technologies, I'd be unsure how best to go about handling this, beyond laying down conventions.
My question is: are there any best practices, hints or tips for maintaining and updating versioning information across different technologies, file types, platforms and version control systems?
Create a properties file that contains the version number and have all of the different components reference that properties file:
Java code can read it through the java.util.Properties API (see the sketch below)
XML can use includes (XInclude, for example)
HTML can use JavaScript to write the version number from the properties into the page
shell scripts can read the file in directly
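A minimal sketch of the Java side, assuming a version.properties file on the classpath with a line like version=1.4.2 (the file name and key are made up):

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Properties;

    public final class Version {
        // Reads the shared version string from version.properties on the classpath.
        public static String get() throws IOException {
            Properties props = new Properties();
            try (InputStream in = Version.class.getResourceAsStream("/version.properties")) {
                if (in != null) {
                    props.load(in);
                }
            }
            return props.getProperty("version", "unknown");
        }
    }

A shell script can source the same file directly (simple key=value lines are valid shell) or grep the value out, so one file can feed every component.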
Indeed, to complete Craig Angus's answer, the rule of thumb here should be to not include any meta-information in your normal delivery files, but to record that metadata (version number, release date, and so on) in one special file included in the release.
That helps when you use one VCS (Version Control System) tool from development to homologation (acceptance testing) to pre-production.
That means whenever you load a workspace (whether for developing, testing, or preparing a release for production), it is the versioning tool that gives you all the details.
When you prepare a delivery (a set of packaged files), you should ask that VCS tool for every piece of meta-information you want to keep, and write it into a special file that is itself included in the said set of files.
That delivery should be packaged in an external directory (outside any workspace) and either:
copied to a shared directory (or a Maven repository) if it is a non-official release (just a quick packaging to help the team next door who is waiting for your delivery). That way you can make 10 or 20 deliveries a day; it does not matter, since they are easily disposable.
imported into the VCS in order to serve as an official delivery, and in order to be deployed easily, since all you need to do is ask the versioning tool for the right version of the right delivery, and you can begin to deploy it.
Note: I just described a release management process mostly used for many inter-dependent projects. For a small single project, you can skip the import into the VCS tool and store your deliveries elsewhere.
In addition to Craig Angus's points, also include the versions of the tools used.