How can I find out what is contained in a saved Docker image tar file?

Let's say that I have saved one or more Docker images to a tar file, e.g.
docker save foo:1.2.0 bar:1.4.5 bar:latest > images.tar
Looking at the tar file, I can see that in addition to the individual layer directories, there is a manifest.json file that contains some meta information about the archive's contents, including a RepoTags array for each image:
[
  {
    "Config": "...",
    "Layers": [...],
    "RepoTags": [
      "foo:1.2.0"
    ]
  },
  {
    "Config": "...",
    "Layers": [...],
    "RepoTags": [
      "bar:latest",
      "bar:1.4.5"
    ]
  }
]
Is there an easy way to extract that info from the tar file, e.g. through a Docker command - or do I have to extract the manifest.json file, run it through a tool like jq and then collect the tag info myself?
The purpose is to find out what is contained in the archive before importing/loading it on a different machine. I imagine that there must be some way to find out what's in the archive...

Since I have not received any answers so far, I'll answer the question myself with what I have tried, which is working for me.
Using a combination of tar and jq, I came up with this command:
tar -xzOf images.tar.gz manifest.json | jq '[.[] | .RepoTags] | add'
This extracts the manifest.json file to stdout and pipes it into jq, which combines the various RepoTags arrays into a single array (the z flag is only needed because this archive was additionally gzipped; for the plain images.tar from the example above, use tar -xOf images.tar manifest.json):
[
  "foo:1.2.0",
  "bar:1.4.5",
  "bar:latest"
]
The result is easy to read and works for me. The only downside is that it requires jq to be installed. I would love to have something that works without dependencies.
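For example, if Python happens to be available, the same information can be pulled out with nothing but the standard library. A minimal sketch, assuming an uncompressed images.tar as produced by docker save:

tar -xOf images.tar manifest.json | python3 -c '
import json, sys
# the manifest is an array with one entry per image; RepoTags may be null
for entry in json.load(sys.stdin):
    for tag in entry.get("RepoTags") or []:
        print(tag)
'

This prints one tag per line rather than a JSON array, but it avoids the jq dependency.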
Still looking for a better answer, don't hesitate to post an answer if you have something that's easier to use!

Related

Insert header for each document before uploading to Elasticsearch

I have an ndjson file with the below format
{"field1": "data1" , "field2": "data2"}
{"field1": "data1" , "field2": "data2"}
....
I want to add a header like
{"index": {}}
before each document, so that I can use the bulk operation
I found a similar question: Elasticsearch Bulk JSON Data
The solution is this jq command:
jq -cr ".[]" input.json | while read line; do echo '{"index":{}}'; echo $line; done > bulk.json
But I get this error:
'while' is not recognized as an internal or external command
What am I doing wrong? I'm running Windows.
Or is there a better solution?
Thanks
The while in your sample is a looping construct that is built into developer-friendly shells such as sh, bash or zsh, which Windows doesn't provide out of the box. See the bash docs for example.
So if this is a one-time thing, the fastest solution is probably to open the file in a text editor and add the required action lines using multi-cursor editing.
On the other hand, if you are restricted to Windows but want a more capable shell for tasks like this, have a look at the cmder project: the full version is packaged with git-for-windows and brings a bash environment to your Windows desktop. That should allow you to use such scripting features even outside a Linux or macOS environment.
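Alternatively, if jq itself already runs on your machine, the while loop can be avoided entirely: since the input file is already newline-delimited JSON, jq reads each line as a separate document, and a filter can emit the action line in front of each one. A sketch that should also work from plain cmd (input.ndjson and bulk.json are placeholder names):

jq -c "{index: {}}, ." input.ndjson > bulk.json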

How to get a JSON list of files in a git repository?

I want to get a list of files from my git repo in JSON format. Any ideas how to do this?
Not exactly sure what you're looking for but
$ git ls-files | jq -MRn '[inputs]'
will generate a JSON array from whatever git ls-files returns, e.g.
[
  ".gitignore",
  ".gitmodules",
  ".travis.yml",
  "AUTHORS",
  "COPYING",
  "ChangeLog",
  "Dockerfile"
]

Import JSON array in CouchBase

I want to use CouchBase to store lots of data. I have that data in the form:
[
  {
    "foo": "bar1"
  },
  {
    "foo": "bar2"
  },
  {
    "foo": "bar3"
  }
]
I have that in a json file that I zipped into data.zip. I then call:
cbdocloader.exe -u Administrator -p **** -b mybucket C:\data.zip
However, this creates a single item in my bucket, not three as I expected. That actually makes sense, as I should be able to store arrays, and I did not "tell" CouchBase to expect multiple items instead of one.
The temporary solution I have is to split the items into multiple json files, add them all to a single zip file and call cbdocloader again. The problem is that I might have lots of these entries, and creating all the files might take too long. Also, I saw in the docs that cbdocloader uses the filename as the key. That might be problematic in my case...
I obviously missed a step somewhere, but couldn't find out what in the documentation. How should I format my json file?
You haven't missed any steps. The cbdocloader script is very limited at the moment. Couchbase will be adding cbimport and cbexport tools in the near future that will allow you to load json files in various formats (including the one you mentioned). In the meantime you will need to stick with the workaround you are already using to get your data loaded.
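For what it's worth, the splitting workaround doesn't have to be done by hand. A rough sketch using jq and a POSIX shell (the generated file names are arbitrary and, as noted above, will end up as the document keys):

jq -c '.[]' data.json | split -l 1 - item_
for f in item_*; do mv "$f" "$f.json"; done
zip data.zip item_*.json

The resulting data.zip can then be loaded with cbdocloader exactly as before.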

Prevent packer printing output for certain inline scripts?

I am running a bunch of Packer scripts, but some of them generate too much output in the logs and it's getting really annoying. Is there any way I can change my json file so that I can disable output for one of these shell scripts in Packer?
One example of my packer shell script calls that I'd like silenced:
{
  "type": "shell",
  "scripts": [
    "scripts/yum_install_and_update",
    "scripts/do_magic"
  ]
}
Packer doesn't natively support this, but if you can modify the scripts you could have them internally suppress the output.
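For example, if those are ordinary shell scripts, a single redirect near the top silences everything that follows. A sketch of what the top of scripts/yum_install_and_update could look like (the log path is just an example; redirect to /dev/null to drop the output entirely):

#!/bin/bash
# keep the noisy output on the build machine instead of in Packer's log
exec > /var/log/yum_install_and_update.log 2>&1

Everything the script prints after the exec line goes to that file instead of Packer's output.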

Best way to format large JSON file? (~30 mb)

I need to format a large JSON file for readability, but every resource I've found (mostly online) doesn't deal with data above, say, 1-2 MB. I need to format about 30 MB. Is there any way to do this, or any way to code something to do this?
With Python >= 2.6 you can do the following:
For Mac/Linux users:
cat ugly.json | python -mjson.tool > pretty.json
For Windows users (thanks to the comment from dnk.nitro):
type ugly.json | python -mjson.tool > pretty.json
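On reasonably recent Python versions, json.tool also accepts the input and output file names directly, which sidesteps the cat/type difference between platforms (and Python 3.9+ adds an --indent option):

python -m json.tool ugly.json pretty.json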
jq can format or beautify a ~100MB JSON file in a few seconds:
jq '.' myLargeUnformattedFile.json > myLargeBeautifiedFile.json
The command above beautifies a single-line ~120 MB file in about 10 seconds, and jq gives you a lot of json manipulation capabilities beyond simple formatting; see their tutorials.
jsonpps (https://github.com/bazaarvoice/jsonpps) is the only one that worked for me.
Unlike jq, jsonpp and the other tools I tried, it doesn't load everything into RAM.
Some useful tips regarding installation and usage:
Download url: https://repo1.maven.org/maven2/com/bazaarvoice/jsonpps/jsonpps/1.1/jsonpps-1.1.jar
Shortcut (for Windows):
Create file jsonpps.cmd in the same directory with the following content:
@echo off
java -Xms64m -Xmx64m -jar %~dp0\jsonpps-1.1.jar %*
Shortcut usage examples:
Format stdin to stdout:
echo { "x": 1 } | jsonpps
Format stdin to file
echo { "x": 1 } | jsonpps -o output.json
Format file to file:
jsonpps input.json -o output.json
Background: I was trying to format a huge JSON file (~89 MB) in VS Code using Format Document (Alt+Shift+F), but as usual it crashed. I used jq to format the file and store the result in another file.
A Windows 11 use case is shown below.
Step 1: download jq for your OS from the official site: https://stedolan.github.io/jq/
Step 2: create a folder named jq on the C drive and paste the executable file that you downloaded into it. Rename the file to jq (Error 1: beware, the file is already an .exe by default, so do not save it as 'jq.exe'; save it only as 'jq').
Step 3: add the folder containing the executable to your PATH variable.
Step 4: open a cmd prompt in the directory where the json file is stored and run the following command: jq . currentfilename.json > targetfilename.json
Replace currentfilename with the name of the file that you want to format.
Replace targetfilename with the name of the file that you want the formatted data written to.
Within seconds you should see the target file in the same directory in a formatted version, which can now be opened in VS Code or any other editor. Any error about jq not being recognized as a command can, with high probability, be traced back to Error 1.
You can use Notepad++ (https://notepad-plus-plus.org/downloads/) for formatting large JSON files (tested in Windows).
Install Notepad++
Go to Plugins -> Plugins Admin -> Install the 'JSON Viewer' plugin. The plugin source code is available at https://github.com/kapilratnani/JSON-Viewer
After plugin installation, go to Plugins -> JSON Viewer -> Format JSON.
This will format your JSON file