Compare build artifacts from two different commits via github actions - github-actions

I've got a workflow in github actions that automatically creates build artifacts and updates a single release with these new build artifacts every time I merge a PR into main (here's the repo).
I want to know if a new PR will cause a change in the build artifacts (specifically, there's just one CSV file that I care about). Sometimes these changes will be intentional, sometimes not, so I want something like a git diff between the CSV file before the PR and the CSV file after the PR.
I know I could set up a GitHub Action to:
Check out the old version of the code.
Run the code to generate the build artifacts.
Save the files of interest to disk.
Check out the proposed version of the code from the PR.
Run the PR code to generate the build artifacts.
git diff the version before the PR against the version after the PR.
Format and write the git diff output as a comment on the PR, letting me know what changed so I can manually check that everything's OK.
But this seems like a really common problem, and I can't believe there's not a simple tool/solution out there already. Maybe some GitHub Action where you give it two SHAs, a command to run, and a list of files to git diff?
To be clear, these are build artifacts, so aren't tracked by git, and so solutions like git diff pullrequest main -- myfile.csv won't work.
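The manual steps listed above can be sketched as a small shell helper. This is only a sketch under assumptions: build_and_diff is a hypothetical name, the two directories are assumed to be separate checkouts of the base commit and the PR commit (for example created with git worktree), and plain diff -u stands in for git diff because the artifacts are not tracked.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the same build command in two checkouts and
# diff one generated file. Exit status follows diff: 0 = identical,
# 1 = different, 2 = trouble (including a failed build).
build_and_diff() {
  local base_dir=$1 head_dir=$2 artifact=$3
  shift 3
  # Run the build command in each checkout; its own output is discarded.
  (cd "$base_dir" && "$@") >/dev/null || return 2
  (cd "$head_dir" && "$@") >/dev/null || return 2
  diff -u "$base_dir/$artifact" "$head_dir/$artifact"
}

# Example usage (assumed layout and file name, not from the question):
#   git worktree add ../base "$BASE_SHA"
#   git worktree add ../head "$HEAD_SHA"
#   build_and_diff ../base ../head build/output.csv make build
```

The diff text printed by the helper is what would then be posted as the PR comment.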

Here is a solution that leverages git notes:
(In a nutshell, git notes allow you to CRUD metadata to a commit without touching the commit itself — and thus preserving history. Cf. § References below.)
Essentially, we want our workflow to:
Build the artefacts. We emulate this by running make build, to be adapted to your own scenario. For the sake of the example, we also assume that the build/ directory contains all and only the generated artefacts.
“Remember” the artefacts and their content (a so-called “artefacts summary”). We use the sha512sum shell command to create a mapping from each artefact's content (represented by its SHA sum) to its file name. We retrieve all artefacts via find build/ -type f, and then convert the mapping to a CSV with headers using sed 's/ /,/' | cat <(echo 'sha512,file_name') -
Attach the artefacts summary to the commit. That's where we leverage git notes, which allow us to add metadata to the commit ex post, without modifying the history.
These steps should be executed for any commit on your main branch.
In case of a PR, you also want to repeat these steps on the branch's HEAD, plus:
Retrieve the artefacts summary of your PR's target branch. So you now have two artefacts summaries to compare: base (your main/master branch's one) and head (the branch of your PR). In the example below, the base is hard coded to main, but you could refine this by letting the workflow retrieve the target branch's name automatically.
Compare both artefacts summaries. I've created the artefactscomparison Python package for that purpose. (Note: it's very much tailored to my use case and desiderata.)
Add the artefact comparison report to your PR. Beebop, a bot will do that for you.
In the end, you should see the artefact comparison report posted as a comment on your PR.
name: Artefacts Comparison
on:
  push:
    branches:
      - main
  pull_request:
    branches:
permissions: write-all
jobs:
  build_artefacts:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
          token: ${{ github.token }}
      - name: Build artefacts
        run: make build
      - name: Generate artefacts summary
        id: artefacts-summary
        run: |
          echo "ARTEFACTS_SUMMARY<<EOF" >> $GITHUB_OUTPUT
          find build/ -type f -exec sha512sum {} \; | sed 's/ /,/' | cat <(echo 'sha512,file_name') - >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Add the artefacts summary as a git notes
        run: |
          git fetch origin refs/notes/*:refs/notes/*
          git config user.name "github-actions"
          git config user.email "bot@github.com"
          git notes add -m "${{ steps.artefacts-summary.outputs.ARTEFACTS_SUMMARY }}"
          git notes show
          git push origin refs/notes/*
  # In case of PR, add report of artefacts comparison
  compare_artefacts:
    runs-on: ubuntu-latest
    if: ${{ github.event_name == 'pull_request' }}
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
          token: ${{ github.token }}
      - name: Pull artefacts summaries (i.e., git notes) from upstream
        run: |
          git fetch origin refs/notes/*:refs/notes/*
      - name: Retrieve PR's head branch's artefacts summary
        id: artefact-summary-head
        run: |
          echo "ARTEFACTS_SUMMARY<<EOF" >> $GITHUB_OUTPUT
          git notes show >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Retrieve PR's target branch's artefacts summary
        id: artefact-summary-base
        run: |
          git checkout ${{ github.base_ref }}
          echo "ARTEFACTS_SUMMARY<<EOF" >> $GITHUB_OUTPUT
          git notes show >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Install artefactscomparison package
        run: pip install -U artefactscomparison
      - name: Generate artefact comparison report
        id: artefact-comparison-report
        run: |
          echo "${{ steps.artefact-summary-head.outputs.ARTEFACTS_SUMMARY }}" > head.csv
          echo "${{ steps.artefact-summary-base.outputs.ARTEFACTS_SUMMARY }}" > base.csv
          echo "ARTEFACTS_REPORT<<EOF" >> $GITHUB_OUTPUT
          artefacts_comparison -b base.csv -h head.csv >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Comment PR with artefact comparison report
        uses: thollander/actions-comment-pull-request@v2
        with:
          message: ${{ steps.artefact-comparison-report.outputs.ARTEFACTS_REPORT }}
          comment_tag: artefact_comparison_report
          mode: recreate
    needs: build_artefacts
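To try the summary-and-compare idea locally, without the workflow or the artefactscomparison package, a plain-shell approximation might look like the following. It is only a sketch: summarise_artefacts and compare_summaries are hypothetical helper names, the report format is made up for illustration and is not the package's output, and it assumes GNU sha512sum and file names that contain no commas or newlines.

```shell
#!/usr/bin/env bash
# Print a CSV summary (sha512,file_name) of every file under a directory.
summarise_artefacts() {
  local dir=$1
  echo 'sha512,file_name'
  # sha512sum prints "<hash>  <path>"; turn the two-space separator into a comma.
  (cd "$dir" && find . -type f -exec sha512sum {} + | sed 's/  /,/' | sort -t, -k2)
}

# Compare two summaries and print one "status,file_name" line per difference,
# where status is added, changed or removed (relative to the base summary).
compare_summaries() {
  local base=$1 head=$2
  awk -F, '
    FNR == 1 { next }                              # skip the CSV header
    FILENAME == ARGV[1] { base[$2] = $1; next }    # remember base hashes
    {
      if (!($2 in base)) print "added," $2
      else if (base[$2] != $1) print "changed," $2
      seen[$2] = 1
    }
    END { for (f in base) if (!(f in seen)) print "removed," f }
  ' "$base" "$head" | sort
}
```

For example, summarise_artefacts build/ > head.csv on the PR branch and the same on main into base.csv, followed by compare_summaries base.csv head.csv, lists the artefacts whose content changed.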
References:
git notes documentation
How to sync (i.e. “pull” and push) git notes with upstream
git notes | Enhance Git Commit Messages with Notes
Git Notes: git's coolest, most unloved feature
Automatically add git notes via Github Actions

Related

Get latest file SHA given filename of changed files in push

Hello, I'm a little confused about whether it is possible in GitHub Actions to get the latest SHA of a file given only its file name.
# This is a basic workflow to help you get started with Actions
name: CI
# Controls when the workflow will run
on:
  # Triggers the workflow on push or pull request events but only for the master branch
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Get specific changed files
        id: changed-files-specific
        uses: tj-actions/changed-files@v15.1
        with:
          files: |
            *.groovy
          files_ignore: |
            *.yml
      # Runs a set of commands using the runners shell
      - name: echo changed files
        run: |
          echo modified files ---
          echo ${{steps.changed-files-specific.outputs.modified_files}}
As you can see with the combination of action changed-files-specific and echo changed files I am able to get the filename. I looked at the documentation of the tj-actions/changed-files library and it does not provide file info support.
Is there an easy way to do this? I tried searching for another action library but it does not seem to be a very common use case.
Many Thanks,
Morgan Morningstar
You are on the right track.
Now when you have all the modified files and their paths - you can just easily do whatever you want with those files.
You can iterate over those files and calculate SHA for each of them using those paths.
Something like this:
for file in ${{ steps.changed-files-specific.outputs.modified_files }}; do
  sha=$(sha1sum "$file" | cut -d ' ' -f 1)
  echo "sha for $file: $sha"
done
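Note that sha1sum hashes only the file's bytes, so the result is not the object ID git stores for the same file. If the git blob SHA is what is wanted (an assumption, since the question does not say which SHA it means), it can be reproduced by hand, because git hashes a header of the form "blob <size>" plus a NUL byte, followed by the content. The sketch below uses a hypothetical helper name, git_blob_sha; inside a checkout, git hash-object <file> should give the same value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute the git blob SHA-1 of a file without calling git.
git_blob_sha() {
  local file=$1 size
  # Byte count of the file; tr strips the padding some wc implementations add.
  size=$(wc -c < "$file" | tr -d ' ')
  # git hashes "blob <size>\0" followed by the raw file content.
  { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d ' ' -f 1
}
```

It can replace the sha1sum call in the loop above, for example sha=$(git_blob_sha "$file").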

How to get all the changes of a Pull Request when triggering on pull_request_review?

I currently have a GitHub Action that triggers on:
pull_request_review:
  types: [submitted]
I then want to run a command, which expects the contents of changes of the Pull Request.
Previously, I was using
on:
  push
and I had no issues with the contents of the files being available in the Action context.
However, my command is failing now, and I think it's because the context only includes the commit that the action was triggered on (no file changes.)
Previously I was running this action on push and that was always successful, with the file changes being available in the context.
I'm using:
steps:
  - uses: actions/checkout@v2
(https://github.com/actions/checkout)
Is it possible to use this to have all the file changes on the Pull Request within the Action context?
Any help on this would be appreciated!
You can do that by using an open source Action available on marketplace:
jobs:
  build:
    runs-on: ubuntu-latest # windows-latest | macos-latest
    name: Test changed-files
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0 # OR "2" -> To retrieve the preceding commit.
      - name: Get changed files
        id: changed-files
        uses: tj-actions/changed-files@v14.6
      - name: List all changed files
        run: |
          for file in ${{ steps.changed-files.outputs.all_changed_files }}; do
            echo "$file was changed"
          done
The solution above uses git checkout and git diff to get the files changed by the PR. Alternatively, if you only need the paths of the changed files and not the files themselves, you can skip the checkout and use the gh CLI:
gh pr view XXX --json files -q '.files[].path'
You can run it like this:
jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      - run: gh pr view XXX --json files -q '.files[].path'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Tag a different branch than master on pull request

I'm new at github actions and there's something I clearly don't understand, so hoping for some help. I've inherited a repo with a workflow consisting of a number of steps, and while I'm getting the overall picture, there's also a part just not working.
What I want to achieve
My workflow builds a number of containers and a binary. These are uploaded to different places, all with version number and tag matching. This works great for a release as everything is tagged properly and references each other. However, I want this to work for an unmerged pull-request as well - this would give the possibility to test things properly.
The problem
The github repo is tagged, but it is always the HEAD of master getting tagged, also on pull requests. For pull requests, I expect the commit of the PR branch to get tagged - but nope.
The workflow below runs as I expect it to - the pre-release step runs for pull request actions, while the release steps runs when the pull request is merged. However, for each pull request action, the HEAD of master is tagged with the generated tag.
Workflow
on:
  push:
    branches:
      - "master"
  pull_request:
    branches: [ master ]
jobs:
  tag:
    runs-on: ubuntu-latest
    outputs:
      name: ${{ steps.generate_tag.outputs.name }}
    steps:
      - uses: actions/checkout@v2.3.4
        with:
          ref: ${{ github.sha }}
      - name: Generate tag name
        id: generate_tag
        run: echo "::set-output name=name::v$(date -d "$(git show -s --format=%ci)" +'%Y.%-m.%-d-%-H%M%S')"
  go:
    runs-on: macos-latest
    needs: [tag]
    steps:
      - uses: actions/checkout@v2.3.4
        with:
          ref: ${{ github.sha }}
      - name: Tag actual releases as releases
        if: "contains(github.ref, 'master')"
        run: |
          git config user.name "GitHub Actions"
          git config user.email noreply@github.com
          git tag "${{ needs.tag.outputs.name }}" -m "Version: ${{ needs.tag.outputs.name }}"
          git checkout "${{ needs.tag.outputs.name }}"
          git branch
      - name: Tag PRs as prerelease
        if: "!contains(github.ref, 'master')"
        run: |
          git config user.name "GitHub Actions"
          git config user.email noreply@github.com
          git tag "${{ needs.tag.outputs.name }}"
          git checkout "${{ needs.tag.outputs.name }}"
          git branch
On a pull_request event from any branch to master, ref: ${{ github.sha }} does not point to the last commit of that branch. Since you're using the push and pull_request events for master and other branches, you should be able to simply remove ref from the actions/checkout step.
Try the following step to check out the code of the branch that generated the event. On a pull request this will be your branch's latest commit, and on the push event after the merge it will be master.
- uses: actions/checkout@v2.3.4
You can find more about GitHub Actions topics here

Github Actions variables point to fork parent

I've forked a project:
and I'm setting up a CI definition on non master branches with these final steps:
- name: Zip the release
  uses: papeloto/action-zip@v1
  with:
    files: README.md LICENSE *.dll nfive.yml nfive.lock fxmanifest.lua index.html config/
    dest: ${{ github.workspace }}\nfive.zip
- name: Attach Zip as build artifact
  uses: actions/upload-artifact@v1
  with:
    name: nfive
    path: ${{ github.workspace }}\nfive.zip
Why does github.workspace point to the original repository NFive\NFive?
So, if I run echo ${{ github.workspace }} it definitely shows the parent repository, but to make this even more difficult, if I change directory to my organisation name and forked repository name I get this:
which is the output of these steps:
- run: echo ${{ github.workspace }}
- name: Move files to artifact folder
  shell: pwsh
  run: |
    cd D:\a\HTB-5M\NFive
    mkdir Build
    Move-Item -Path README.md,LICENSE,*.dll,nfive.yml,nfive.lock,fxmanifest.lua,index.html,config -Destination Build
I don't have access to the parent path because I'm not a contributor, which is why I forked in the first place, so why does Github assume github.workspace should map to the parent?
=== Update ===
So I reforked to my user account rather than the organisation I used and added a workflow step to just display the github.workspace variable and it definitely says the parent workspace D:\a\NFive\NFive
I tried changing the path in the nfive.yml
Same outcome
Deleting the nfive.yml doesn't change anything. I think the nfive.yml is actually used in the CI / CD pipelines the original authors configured on appveyor so it isn't going to affect anything here.
It isn't pointing to the fork's parent repo. The github.workspace variable follows this format: /home/runner/work/my-repo-name/my-repo-name.
That's why NFive appears twice: it does NOT refer to the NFive username / organization, but twice to the NFive repository name.
This is explained on the github documentation about default environment variables
I forked the repository as well and I don't get any error when running this workflow.
Here is the workflow run output with the generated .zip file artifact.
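To make the format described above concrete, here is a small illustrative sketch. workspace_path is a hypothetical helper, not something the runner provides; the default root /home/runner/work is taken from the format quoted in this answer, and on the Windows runner in the question the root shows up as D:\a instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the workspace path from an "owner/repo" string.
# Only the repository name is used, and it appears twice; the owner is ignored.
workspace_path() {
  local repository=$1 root=${2:-/home/runner/work}
  local name=${repository##*/}   # strip everything up to the last slash
  printf '%s/%s/%s\n' "$root" "$name" "$name"
}
```

With this, workspace_path HTB-5M/NFive and workspace_path NFive/NFive give the same path, which is why the fork's workspace looks as if it belonged to the parent.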

How to setup Github action to only run if a specified git tag does not exist

I'm working on a GitHub Action that will create a release draft. In the action, I'd like to run the release code only if the app version does not have a corresponding git tag.
The current action yaml looks similar to:
# ...
jobs:
  # test, winbuild and linuxbuild jobs
  draftrelease:
    needs: [test, winbuild, linuxbuild]
    runs-on: ubuntu-latest
    # if ${{jobs.test.steps.appversion.outputs.version}} is not a tag
    steps:
      # ...
I know I can use the following to print if the tag exists, but I need to check if the tag does not exist within the if:
git show-ref --tags --verify -- "refs/tags/${{jobs.test.steps.appversion.outputs.version}}"
How would I go about setting up the job to only run if jobs.test.steps.appversion.outputs.version is not a git tag?
I managed to achieve this by using a module that reads the app version, checking git tags, and then checking a variable at each step of the build:
jobs:
  build:
    name: Compile Bundles
    strategy:
      matrix:
        os: [windows-latest, ubuntu-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - name: Checkout master branch
        uses: 'actions/checkout@v2'
      # Fetches all tags for the repo
      - name: Fetch tags
        run: git fetch --depth=1 origin +refs/tags/*:refs/tags/*
      # Reads the app's version
      # In my case, I have a nodejs project, so I read the app's version
      # from package.json. You may need to find a different github action
      # module specific to your language to extract the app version
      - name: Read package.json
        id: package
        uses: gregoranders/nodejs-project-info@v0.0.1
      # Check if the app version has a git tag
      # If there is a git tag for the version set the variable 'tagged' to 0
      # if there is NOT a git tag for the version set the variable 'tagged' to 1
      - name: 'Check: package version has corresponding git tag'
        id: tagged
        shell: bash
        run: git show-ref --tags --verify --quiet -- "refs/tags/v${{ steps.package.outputs.version }}" && echo "::set-output name=tagged::0" || echo "::set-output name=tagged::1"
      - name: Step to only run if there is no git-tag for the version
        if: steps.tagged.outputs.tagged == 1
        # ...
      - name: Another step to only run if there is no git-tag for the version
        if: steps.tagged.outputs.tagged == 1
        # ...
The two big things to note are the git show-ref line and the if: statements that follow in later steps.
# Attempt to output the tagged reference for the app's version
# In my case all version tags are prefixed with 'v' so you may need to alter
# this line to better suit your needs
git show-ref --tags --verify --quiet -- "refs/tags/v${{ steps.package.outputs.version }}"
# If outputting the tag was successful set the variable indicating the tag exists
  && echo "::set-output name=tagged::0"
# if outputting failed/errored, set the variable indicating the tag does not exist
  || echo "::set-output name=tagged::1"
Once the above has run, the GitHub Actions variable steps.tagged.outputs.tagged will be 0 if the version has been tagged, and 1 if the version has not been tagged.
From there you just need to check that variable with each step you only want to run if the version has not been tagged:
if: steps.tagged.outputs.tagged == 1
Git Show-ref Documentation
Github Actions Documentation
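GitHub has since deprecated the ::set-output workflow command in favour of appending name=value lines to the file named by $GITHUB_OUTPUT. A minimal sketch of the same check in that style follows; record_tag_state is a hypothetical function name, the 'v' prefix is carried over from the answer above, and the fallback to standard output is only there so the function can be tried outside a workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: record whether the tag v<version> exists.
# Writes tagged=0 if the tag exists and tagged=1 if it does not,
# matching the convention used in the answer above.
record_tag_state() {
  local version=$1
  # Inside a workflow GITHUB_OUTPUT is set by the runner; fall back to stdout otherwise.
  local output_file=${GITHUB_OUTPUT:-/dev/stdout}
  if git show-ref --tags --verify --quiet -- "refs/tags/v${version}"; then
    echo "tagged=0" >> "$output_file"
  else
    echo "tagged=1" >> "$output_file"
  fi
}
```

In a step with id: tagged, calling record_tag_state with the package version would make steps.tagged.outputs.tagged available to the later if: conditions as before.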