What is the difference between github container registry and github artifact space? - github-actions

I uploaded many GitHub artifacts, causing the GitHub free storage space (50 GB) to run out.
Most of these artifacts were copies or had very small changes.
I assumed these were stored in layers as diffs from parent images (the way Docker images are stored), in which case it seemed unlikely that 50 GB of space would run out.
Are GitHub artifacts stored as separate full files every time a workflow is run?
Are GitHub packages stored in the same way?
Is the storage for packages and artifacts the same?

GitHub's artifacts are usually linked with:
release: artifacts could be another term for "assets": files associated with a particular GitHub release (see Update a release asset for instance).
or workflow artifacts: An artifact is a file or collection of files produced during a workflow run. It allows you to persist data after a job has completed, and share that data with another job in the same workflow.
As Edward Thomson (from GitHub) notes: "because they're meant to be used to move data between jobs in a workflow, workflow assets are not permanent".
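For illustration, a minimal workflow sketch of that mechanism (the job names, scripts and paths are assumptions, not taken from the question); each upload is stored as its own archive rather than as layers:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh                      # assumed build script that produces dist/
      - uses: actions/upload-artifact@v4     # each upload is stored as its own archive
        with:
          name: dist
          path: dist/
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4   # fetch the files produced by the build job
        with:
          name: dist
      - run: ls -R                           # the downloaded files are now available to this job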
GitHub Container Registry is dedicated to storing and managing Docker and OCI images.
Only container images benefit from incremental storage through layers.
Release assets and workflow artifacts are uploaded whole, file by file.
From the comments below:
A workflow that authenticates to GHCR (GitHub Container Registry) and pushes an image to the registry (docker push ghcr.io/OWNER/IMAGE_NAME:tag) will benefit from incremental, layer-by-layer storage.
This differs from a regular asset upload, where the artifact is stored as a whole.
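A rough sketch of that kind of workflow, keeping the OWNER/IMAGE_NAME:tag placeholder from the command above and assuming the built-in GITHUB_TOKEN is used for authentication; only layers the registry does not already have are uploaded:
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Log in to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      - name: Build and push
        run: |
          docker build -t ghcr.io/OWNER/IMAGE_NAME:tag .
          docker push ghcr.io/OWNER/IMAGE_NAME:tag   # only layers the registry lacks are uploaded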

Related

Github Action how to deal with standalone config file

We are using Github Action to deploy our code. On push, the source code will be pushed and we were able to build the code and deploy successfully if the config file is also tracked by the repository. However, we are encountering a problem with a config file in .gitignore.
Our app has different versions, controlled by this config file, and also this file is different from testing to production. Therefore, this file is standalone and not tracked by the git repository. However, for Github actions to build the project correctly, this file is necessary and has to be placed on a certain path of the project, e.g., /configs/env_configs.json.
This seems like a very common use case, but I find very little information in GitHub Actions' documentation.
Is there a good way to work this out?

Managing composer and deployment

So, I'm enjoying using composer, but I'm struggling to understand how others use it in relation to a deployment service. Currently I'm using deployhq, and yes, I can set it to deploy and run composer when there is an update to the repo, but this doesn't make sense to me now.
My main composer repo, containing just the json file of all of the packages I want to include in my build, only gets updated when I add a new package to the list.
When I update my theme, or custom extension (which is referenced in the json file), there is no "hook" to update my deployment service. So I have to log in to my server and manually run composer (which takes the site down until it's finished).
So how do others manage this? Should I only run composer locally and include the vendor folder in my repo?
Any answers would be greatly appreciated.
James
There will always be arguments as to the best way to do things such as this and there are different answers and different options - the trick is to find the one that works best for you.
Firstly
I would first take a step back and look at how you are managing your composer.json
I would recommend that all of the packages in your composer.json be locked down to the exact version number of the item in Packagist. If you are using GitHub repos for any of the packages (or they are set to dev-master), then I would ensure that these packages are locked to a specific commit hash! It sounds like you are basically there with this, as you say nothing updates out of the packages when you run it.
Why?
This is to ensure that when you run composer update on the server, these packages are taken from the cache if they exist, and that you don't accidentally deploy untested code if one of the modules happens to get updated between your testing and your deployment.
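As a minimal sketch of what that locking looks like (the package names and commit hash are purely illustrative), a composer.json pinned this way uses exact Packagist versions for third-party packages and a commit hash for anything tracking a dev branch:
{
    "require": {
        "monolog/monolog": "2.9.1",
        "acme/custom-theme": "dev-master#8a5e3b0"
    }
}
The #hash suffix on a dev branch is what locks a VCS package to a specific commit.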
Actual deployments
Possible Method 1
My opinion is slightly controversial: for many of my Composer projects that don't go through a CI system, I will commit the entire vendor directory to version control. This is quite simply to ensure that I have a completely deployable branch at any stage; it also makes deployments incredibly quick and easy (git pull).
There will already be people saying that this is unnecessary, that locking down the version numbers is enough to handle any remote system failures, that it clogs up the VCS tree, etc. I won't go into these now; there are arguments for and against (a lot of them opinion-based). But since you mentioned it in your question, I thought I would let you know that it has served me well on a lot of projects in the past and it is a viable option.
Possible Method 2
By pointing your document root at a symlink on the server, you can build into a new directory and only switch the symlink over to it once you have confirmed the build completed.
This is the path of least resistance towards a safe deployment for a basic code set using composer update on the server. I actually use this method in conjunction with most of my deployments (including the methods above and below).
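A minimal sketch of such a switch-over, with hypothetical paths and repository URL, assuming the web server's document root points at /var/www/example/current:
cd /var/www/example
git clone git@example.com:me/site.git releases/20160501   # hypothetical repo and release path
cd releases/20160501 && composer update                   # the build completes (or fails) here
cd /var/www/example
ln -sfn releases/20160501 current                         # switch the docroot symlink only now
If the build fails, the symlink is never moved and the live site keeps serving the previous release.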
Possible Method 3
Composer can use "artifacts" rather than a remote server. This means you basically create a "repository folder" of your vendor packages as zip files. It is an alternative to adding the entire vendor folder to your VCS, but it also protects you against GitHub / Packagist outages, files being removed, and various other potential issues. The packages are retrieved from the artifacts folder and installed directly from the zip files rather than being fetched from a server, and the folder itself can be stored remotely. Think of it as a poor man's private Packagist (another option, by the way).
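A sketch of what the composer.json side of this could look like (the folder path and package name are illustrative); Composer's "artifact" repository type scans that folder for zip files that each contain their own composer.json:
{
    "repositories": [
        {
            "type": "artifact",
            "url": "artifacts/"
        }
    ],
    "require": {
        "acme/custom-extension": "1.2.0"
    }
}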
IMO - The best method overall
Set up a CI system (like Jenkins), create some tests for your application and have them respond to push webhooks on your VCS so it builds each time something is pushed. In this build you will set up the system to:
run tests on your application (If they exist)
run composer update
generate an artifact of these files (if the above items succeed)
Jenkins can also do an actual deployment for you if you wish (and the build process doesn't fail). It can:
push the artifact to the server via SSH
deploy the artifact using a script
But if you already have a deployment system in place, having a tested artifact to be deployed will probably be one of its deployment scenarios.
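As a rough sketch of that kind of build expressed as a declarative Jenkinsfile (the stage names, scripts and packaged file list are assumptions, not a drop-in configuration):
pipeline {
    agent any
    stages {
        stage('Test') {
            steps { sh './run-tests.sh' }             // assumed test script, if tests exist
        }
        stage('Build') {
            steps { sh 'composer update --no-dev' }   // resolve the locked package versions
        }
        stage('Package') {
            steps { sh 'tar czf build.tar.gz src vendor composer.json composer.lock' }  // illustrative file list
        }
    }
    post {
        success {
            archiveArtifacts artifacts: 'build.tar.gz'   // keep the tested artifact for deployment
        }
    }
}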
Hope this helps :)

Jenkins projects pointing to same Mercurial repo do not share source

I am using Jenkins for our build server. I have multiple projects using the same Mercurial (Hg) repository and want to avoid each project cloning its own local repo to build from (since the repo is rather large). This is supposed to be possible via Jenkins and the Mercurial plugin.
In my Mercurial plugin configuration I have checked both "Use Repository Caches" and "Use Repository Sharing". In each project, the same repository location (a network location specified via IP address) is listed.
However, each project still seems to want to create a clone of the repository. Any ideas?
In our setup (using Jenkins 1.506), I've defined a custom workspace under the Advanced Project Options for each of my builds, typically at [project]\repo and then build from there into a \build\ folder.
If you define the custom workspace for each Jenkins project to point to the same shared custom workspace using the same source for the repo it will reuse what is already there.
I've not tested this, but I would assume that under this setup, it is important to prevent concurrent builds from occurring in the same working directory. Bad things would follow.
As a followup question: What is your rationale for not wanting each build to have its own source code?

Can Jenkins store artifacts outside the job directory?

I currently have Jenkins set up with a number of jobs, but it's proving difficult to back up because the artifacts are stored within the job directory. I'd like to back up the job configurations and artifacts separately. I'm sure I remember reading somewhere that Jenkins now has an option to store them outside the job, but I can't find this.
Is there any configuration option that does this while still making the artifacts visible from within the job on the Jenkins interface? (ie rather than merely an add-in that copies the artifacts elsewhere)
Go to your Jenkins configuration page, e.g.
http://mybuildserver.acme.com/configure
At the top of the configuration page there is a "home directory" setting. Click the "advanced..." button below it.
Now set the "Workspace Root Directory" to e:\jenkins-workspaces\${ITEM_FULL_NAME}, and "Build Record Root Directory" to e:\jenkins-builds\${ITEM_FULL_NAME} or something similar.
Warning: I run Jenkins 2.7.2 and noticed that certain features don't work properly after configuring Jenkins like that. I saw problems with folders and problems with the multi-branch project plugin. Check the status of those issues if you rely on these features.
As you can see here, there are many plugins to deploy artifacts anywhere you want or need (FTP, CIFS, Confluence, Artifactory, ...), especially the ArtifactDeployer plugin, which will allow you to make a copy of the artifacts in the Jenkins home.
Thank you Sam for your post, which pointed me in the right direction to solve my problem.
I had been searching for a way to make a symlink to the job archive of a build for multibranch projects. Up to now, we used to manually search for the correct folder basename in the filesystem and add it to the Jenkinsfile.
Now, I can simply use
jobOutputFolder = currentBuild.rawBuild.artifactsDir.path
and use that in my script.
If security is a concern, I could implement that as a shared library additionally.
Try the Use Custom Workspace build option. From the Jenkins popup help:
For each job on Jenkins, Jenkins allocates a unique "workspace directory." This is the directory where the code is checked out and builds happen. Normally you should let Jenkins allocate and clean up workspace directories, but in several situations this is problematic, and in such case, this option lets you specify the workspace location manually.
This option is also available under advanced project properties of multi-configuration project builds.
A Groovy script under "Prepare an environment for the run" will always run on the master, and this script can create a symlink from the build's artifacts directory to the location where you really want artifacts archived (archive_to), which SHOULD include the job name and build number:
import java.nio.file.*
try {
    // createSymbolicLink() throws IOException on failure instead of returning false
    Files.createSymbolicLink(Paths.get(currentBuild.artifactsDir.path),
                             Paths.get(archive_to.getCanonicalPath()))
} catch (IOException e) {
    throw new RuntimeException("Can't create symlink to archive dir", e)
}
Of course (sadly), when old builds are purged by Jenkins, the old artifacts are left behind, because Jenkins will not follow a symlink when purging, even if Jenkins owns both the symlink and the target (shame).
A workaround may be to point a symlink back from the new archive dir; then, when Jenkins purges its archive dir, the new symlink will dangle and a cron job can later delete the new job archive dir.
The Copy Artifact Plugin (https://wiki.jenkins-ci.org/display/JENKINS/Copy+Artifact+Plugin) adds a build step for retrieving files from another project into the current workspace, so you can work from there.
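If you use Pipeline jobs, the plugin also exposes a copyArtifacts step; a minimal sketch, with a hypothetical job name and filter:
copyArtifacts(
    projectName: 'upstream-job',   // hypothetical upstream job name
    filter: 'build/**',            // hypothetical path filter for the artifacts to copy
    selector: lastSuccessful()     // take artifacts from the last successful build
)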

On a Hudson master node, what are the .tmp files created in the workspace-files folder?

Question:
In the path HUDSON_HOME/jobs/<jobname>/builds/<timestamp>/workspace-files, there are a series of .tmp files. What are these files, and what feature of Hudson do they support?
Background
Using Hudson version 1.341, we have a continuous build task that runs on a slave instance. After the build is otherwise complete, including archiving the artifacts, task scanner, etc., the job appears to hang for a long period of time. In monitoring the master node, I noted that many .tmp files were being created and modified under builds/<timestamp>/workspace-files, and that some of them were very large. This appears to be causing the delay, as the job completed at the same time that files in this path stopped changing.
Some key configuration points of the job:
It is tied to a specific slave node
It builds in a 'custom workspace'
It runs the Task Scanner plugin on a portion of the workspace to find "todo" items
It triggers a downstream job that builds in the same custom workspace on the same slave node
In this particular instance, the .tmp files were being created by the Task Scanner plugin. When tasks are found, the files in which they are found are copied back to the master node. This allows the master node to serve those files in the browser interface for Tasks.
Per this answer, it is likely that this same thing occurs with other plug-ins, too.
Plug-ins known to exhibit this behavior (feel free to add to this list)
Task Scanner
Warnings
FindBugs
There's an explanation on the Hudson users mailing list:
...it looks like the warnings plugin copies any files that have compiler warnings from the workspace (possibly on a slave) into a "workspace-files" directory within HUDSON_HOME/jobs/<jobname>/builds/<timestamp>
The files then, I surmise, get processed, resulting in a "compiler-warnings.xml" file within the HUDSON_HOME/jobs/<jobname>/builds/<timestamp> directory.
I am using the "warnings" plugin, and I suspect it's related to that.