Find and delete a nested JSON object in Go

I have a json document of a Kubernetes Pod, here's an example:
https://github.com/itaysk/kubectl-neat/blob/master/test/fixtures/pod-1-raw.json
I'd like to traverse spec.containers[i].volumeMounts and delete those volumeMount objects where the .name starts with "default-token-". Note that both containers and volumeMounts are arrays.
Using jq it took me 1 min to write this 1 line: try del(.spec.containers[].volumeMounts[] | select(.name | startswith("default-token-"))). I'm trying to rewrite this in Go.
While looking for a good json library I settled on gjson/sjson.
Since sjson doesn't support array accessors (the # syntax), and gjson doesn't support getting the path of result, I looked for workarounds.
I've tried using Result.Index to delete the result from the byte slice directly, and succeeded, but for the query I wrote (spec.containers.#.volumeMounts.#(name%\"default-token-*\")|0) the Index is always 0 (I tried different variations of it, same result).
So currently I have some 25-line code that uses gjson to get spec.containers.#.volumeMounts, iterates its way through the structure, and eventually uses sjson.Delete to delete.
It works, but it feels way more complicated than I expected it to be.
Is there a better way to do this in Go? I'm willing to switch json library if needed.
EDIT: I would prefer to avoid using a typed schema because I may need to perform this on different types, for some I don't have the full schema.
(also removed some distracting details about my current implementation)
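For reference, the index-based gjson/sjson traversal I describe above looks roughly like this (a sketch for illustration, not my exact code; the function name is made up):
// Hypothetical sketch: index-based traversal with gjson + sjson.
// Needs "github.com/tidwall/gjson", "github.com/tidwall/sjson", "fmt", "strings".
func deleteDefaultTokenMounts(podJSON []byte) ([]byte, error) {
    out := podJSON
    containers := gjson.GetBytes(podJSON, "spec.containers").Array()
    for ci, c := range containers {
        mounts := c.Get("volumeMounts").Array()
        // Walk the mounts in reverse so earlier indexes stay valid after a delete.
        for mi := len(mounts) - 1; mi >= 0; mi-- {
            if strings.HasPrefix(mounts[mi].Get("name").String(), "default-token-") {
                path := fmt.Sprintf("spec.containers.%d.volumeMounts.%d", ci, mi)
                var err error
                if out, err = sjson.DeleteBytes(out, path); err != nil {
                    return nil, err
                }
            }
        }
    }
    return out, nil
}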

The easiest thing to do here is parse the JSON into an object, work with that object, then serialise back into JSON.
Kubernetes provides a Go client library that defines the v1.Pod struct you can Unmarshal onto using the stdlib encoding/json:
// import "k8s.io/api/core/v1"
var pod v1.Pod
if err := json.Unmarshal(podBody, &pod); err != nil {
    log.Fatalf("parsing pod json: %s", err)
}
From there you can read pod.Spec.Containers and their VolumeMounts:
// Modify. Iterate backwards so removing an element doesn't skip the one after it.
for c := range pod.Spec.Containers {
    container := &pod.Spec.Containers[c]
    for i := len(container.VolumeMounts) - 1; i >= 0; i-- {
        if strings.HasPrefix(container.VolumeMounts[i].Name, "default-token-") {
            // Remove the VolumeMount at index i.
            container.VolumeMounts = append(container.VolumeMounts[:i], container.VolumeMounts[i+1:]...)
        }
    }
}
https://play.golang.org/p/3r5-XKIazhK
If you're worried about losing arbitrary JSON which might appear in your input, you may instead wish to define var pod map[string]interface{} and then type-assert each of the properties within, as spec, ok := pod["spec"].(map[string]interface{}), containers, ok := spec["containers"].([]interface{}), and so on (JSON arrays decode to []interface{}, so each container element then needs its own assertion to map[string]interface{}).
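For illustration, a rough sketch of that untyped traversal (my own sketch, not part of the original answer; it assumes the spec.containers[].volumeMounts shape from the question, and the function name is made up):
// Needs "encoding/json" and "strings".
func stripDefaultTokenMounts(podBody []byte) ([]byte, error) {
    var pod map[string]interface{}
    if err := json.Unmarshal(podBody, &pod); err != nil {
        return nil, err
    }
    spec, ok := pod["spec"].(map[string]interface{})
    if !ok {
        return podBody, nil // no spec: nothing to do
    }
    containers, ok := spec["containers"].([]interface{})
    if !ok {
        return podBody, nil
    }
    for _, c := range containers {
        container, ok := c.(map[string]interface{})
        if !ok {
            continue
        }
        mounts, ok := container["volumeMounts"].([]interface{})
        if !ok {
            continue
        }
        kept := mounts[:0]
        for _, m := range mounts {
            if mount, isMap := m.(map[string]interface{}); isMap {
                if name, _ := mount["name"].(string); strings.HasPrefix(name, "default-token-") {
                    continue // drop this volumeMount
                }
            }
            kept = append(kept, m)
        }
        container["volumeMounts"] = kept
    }
    return json.Marshal(pod)
}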
Hope that helps.
ps. The "removing" is following https://github.com/golang/go/wiki/SliceTricks#delete

To take a totally different approach from before, you could create a
type Root struct {
    fields struct {
        Spec *Spec `json:"spec,omitempty"`
    }
    other map[string]interface{}
}
with custom UnmarshalJSON which unmarshals into both fields and other, and custom MarshalJSON which sets other["spec"] = json.RawMessage(spec.MarshalJSON()) before returning json.Marshal(other):
func (v *Root) UnmarshalJSON(b []byte) error {
    if err := json.Unmarshal(b, &v.fields); err != nil {
        return err
    }
    if v.other == nil {
        v.other = make(map[string]interface{})
    }
    if err := json.Unmarshal(b, &v.other); err != nil {
        return err
    }
    return nil
}

func (v *Root) MarshalJSON() ([]byte, error) {
    var err error
    if v.other["spec"], err = rawMarshal(v.fields.Spec); err != nil {
        return nil, err
    }
    return json.Marshal(v.other)
}

func rawMarshal(v interface{}) (json.RawMessage, error) {
    b, err := json.Marshal(v)
    if err != nil {
        return nil, err
    }
    return json.RawMessage(b), nil
}
You then define these sorts of types all the way down through .spec.containers.volumeMounts and have a Container.MarshalJSON which throws away any VolumeMounts we don't like:
func (v *Container) MarshalJSON() ([]byte, error) {
    // Filter in place rather than deleting while ranging, so consecutive matches aren't skipped.
    mounts := v.fields.VolumeMounts[:0]
    for _, mount := range v.fields.VolumeMounts {
        if !strings.HasPrefix(mount.fields.Name, "default-token-") {
            mounts = append(mounts, mount)
        }
    }
    var err error
    if v.other["volumeMounts"], err = rawMarshal(mounts); err != nil {
        return nil, err
    }
    return json.Marshal(v.other)
}
Full playground example: https://play.golang.org/p/k1603cchwC7
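For context, the intermediate types follow the same pattern as Root, each with an analogous UnmarshalJSON/MarshalJSON pair (a rough sketch of the shape, not a copy of the playground code):
type Spec struct {
    fields struct {
        Containers []Container `json:"containers,omitempty"`
    }
    other map[string]interface{}
}

type Container struct {
    fields struct {
        VolumeMounts []VolumeMount `json:"volumeMounts,omitempty"`
    }
    other map[string]interface{}
}

type VolumeMount struct {
    fields struct {
        Name string `json:"name"`
    }
    other map[string]interface{}
}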
I wouldn't do this.

Related

Parse JSON having sibling dynamic keys alongside with static in Go

I need to parse this json
{
    "version": "1.1.29-snapshot",
    "linux-amd64": {
        "url": "https://origin/path",
        "size": 7794688,
        "sha256": "14b3c3ad05e3a98d30ee7e774646aec7ffa8825a1f6f4d9c01e08bf2d8a08646"
    },
    "windows-amd64": {
        "url": "https://origin/path",
        "size": 8102400,
        "sha256": "01b8b927388f774bdda4b5394e381beb592d8ef0ceed69324d1d42f6605ab56d"
    }
}
Keys like linux-amd64 are dynamic and their number is arbitrary. I tried something like this to describe it and unmarshal it. Obviously it doesn't work; Items is always empty.
type FileInfo struct {
    Url    string `json:"url"`
    Size   int64  `json:"size"`
    Sha256 string `json:"sha256"`
}
type UpdateInfo struct {
    Version string `json:"version"`
    Items   map[string]FileInfo
}
It's similar to this use case, but there is no parent key items. I suppose I could use a 3rd-party library or the map[string]interface{} approach, but I'm interested in knowing how to achieve this with explicitly declared types.
The rest of the parsing code is:
func parseUpdateJson(jsonStr []byte) (UpdateInfo, error) {
    var allInfo = UpdateInfo{Items: make(map[string]FileInfo)}
    var err = json.Unmarshal(jsonStr, &allInfo)
    return allInfo, err
}
Look at the link I attached and you will realize it is not as simple as you think. Also, I pointed out that I'm interested in a typed approach. OK, how should I declare this map[string]FileInfo so it gets parsed?
You can create a json.Unmarshaler to decode the JSON into a map, then apply those values to your struct: https://play.golang.org/p/j1JXMpc4Q9u
type FileInfo struct {
    Url    string `json:"url"`
    Size   int64  `json:"size"`
    Sha256 string `json:"sha256"`
}
type UpdateInfo struct {
    Version string `json:"version"`
    Items   map[string]FileInfo
}
func (i *UpdateInfo) UnmarshalJSON(d []byte) error {
    tmp := map[string]json.RawMessage{}
    err := json.Unmarshal(d, &tmp)
    if err != nil {
        return err
    }
    err = json.Unmarshal(tmp["version"], &i.Version)
    if err != nil {
        return err
    }
    delete(tmp, "version")
    i.Items = map[string]FileInfo{}
    for k, v := range tmp {
        var item FileInfo
        err := json.Unmarshal(v, &item)
        if err != nil {
            return err
        }
        i.Items[k] = item
    }
    return nil
}
This answer is adapted from this recipe in my YouTube video on advanced JSON handling in Go.
func (u *UpdateInfo) UnmarshalJSON(d []byte) error {
    var x struct {
        UpdateInfo
        UnmarshalJSON struct{} // shadows the promoted method so this unmarshal doesn't recurse
    }
    if err := json.Unmarshal(d, &x); err != nil {
        return err
    }
    var y map[string]json.RawMessage
    if err := json.Unmarshal(d, &y); err != nil {
        return err
    }
    delete(y, "version") // We don't need this in the map
    *u = x.UpdateInfo
    u.Items = make(map[string]FileInfo, len(y))
    for k, v := range y {
        var info FileInfo
        if err := json.Unmarshal(v, &info); err != nil {
            return err
        }
        u.Items[k] = info
    }
    return nil
}
It:
Unmarshals the JSON into the struct directly, to get the struct fields.
Re-unmarshals into a map[string]json.RawMessage to get the arbitrary keys. This is necessary since the value of version is not of type FileInfo, and trying to unmarshal directly into map[string]FileInfo would therefore error.
Deletes the keys we know we already captured in the struct fields.
Iterates through the remaining map entries, unmarshals each value into the FileInfo type, and stores it in the final object.
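A quick usage sketch (my own addition, using the JSON from the question with the hashes shortened; needs "encoding/json", "fmt", "log"):
const doc = `{
    "version": "1.1.29-snapshot",
    "linux-amd64": {"url": "https://origin/path", "size": 7794688, "sha256": "14b3..."},
    "windows-amd64": {"url": "https://origin/path", "size": 8102400, "sha256": "01b8..."}
}`

func main() {
    var u UpdateInfo
    if err := json.Unmarshal([]byte(doc), &u); err != nil {
        log.Fatal(err)
    }
    fmt.Println(u.Version)                   // 1.1.29-snapshot
    fmt.Println(u.Items["linux-amd64"].Size) // 7794688
}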
If you really don't want to unmarshal multiple times, your next best option is to iterate over the JSON tokens in your input by using the json.Decoder type. I've done this in a couple of performance-sensitive bits of code, but it makes your code INCREDIBLY hard to read, and in almost all cases is not worth the effort.
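For completeness, a rough sketch of what that token-based variant of UnmarshalJSON could look like (my sketch, with the readability caveat above; it needs "bytes" in addition to "encoding/json"):
func (u *UpdateInfo) UnmarshalJSON(d []byte) error {
    dec := json.NewDecoder(bytes.NewReader(d))
    if _, err := dec.Token(); err != nil { // consume the opening '{'
        return err
    }
    u.Items = make(map[string]FileInfo)
    for dec.More() {
        keyTok, err := dec.Token() // read the next object key
        if err != nil {
            return err
        }
        key, _ := keyTok.(string)
        if key == "version" {
            if err := dec.Decode(&u.Version); err != nil {
                return err
            }
            continue
        }
        var info FileInfo
        if err := dec.Decode(&info); err != nil {
            return err
        }
        u.Items[key] = info
    }
    return nil
}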

Go reading map from json stream

I need to parse a really long JSON file (more than a million items). I don't want to load it into memory; I want to read it chunk by chunk. There's a good example with an array of items here. The problem is that I'm dealing with a map, and when I call Decode I get not at beginning of value.
I can't figure out what should be changed.
const data = `{
    "object1": {"name": "cattle","location": "kitchen"},
    "object2": {"name": "table","location": "office"}
}`
type ReadObject struct {
    Name     string `json:"name"`
    Location string `json:"location"`
}
func ParseJSON() {
    dec := json.NewDecoder(strings.NewReader(data))
    tkn, err := dec.Token()
    if err != nil {
        log.Fatalf("failed to read opening token: %v", err)
    }
    fmt.Printf("opening token: %v\n", tkn)
    objects := make(map[string]*ReadObject)
    for dec.More() {
        var nextSymbol string
        if err := dec.Decode(&nextSymbol); err != nil {
            log.Fatalf("failed to parse next symbol: %v", err)
        }
        nextObject := &ReadObject{}
        if err := dec.Decode(&nextObject); err != nil {
            log.Fatalf("failed to parse next object")
        }
        objects[nextSymbol] = nextObject
    }
    tkn, err = dec.Token()
    if err != nil {
        log.Fatalf("failed to read closing token: %v", err)
    }
    fmt.Printf("closing token: %v\n", tkn)
    fmt.Printf("OBJECTS: \n%v\n", objects)
}
TL;DR: when you call the Token() method for the first time, you move the offset away from the beginning of the JSON value, and that is why you get the error.
You are working with this struct (link):
type Decoder struct {
    // other fields omitted for simplicity
    tokenState int
}
Pay attention to the tokenState field. Its value can be one of (link):
const (
    tokenTopValue = iota
    tokenArrayStart
    tokenArrayValue
    tokenArrayComma
    tokenObjectStart
    tokenObjectKey
    tokenObjectColon
    tokenObjectValue
    tokenObjectComma
)
Let's get back to your code. You are calling the Token() method. This method obtains the first valid JSON token, {, and changes tokenState to tokenObjectStart (link). Now you are in the "in-an-object" state.
If you try to call Decode() at this point you will get an error (not at beginning of value). This is because the allowed states of tokenState for calling Decode() are tokenTopValue, tokenArrayStart, tokenArrayValue, and tokenObjectValue, i.e. a "full" value, not part of one (link).
To avoid this you can simply not call Token() at all and do something like this:
dec := json.NewDecoder(strings.NewReader(data))
objects := make(map[string]*ReadObject)
if err := dec.Decode(&objects); err != nil {
    log.Fatalf("failed to parse next symbol: %v", err)
}
fmt.Printf("OBJECTS: \n%v\n", objects)
Or, if you want to read chunk by chunk, you could keep calling Token() until you reach a "full" value, and then call Decode() on that value (I guess this should work).
After consuming the initial { with your first call to dec.Token(), you must:
use dec.Token() to extract the next key
after extracting the key, call dec.Decode(&nextObject) to decode an entry
Example code:
for dec.More() {
    key, err := dec.Token()
    if err != nil {
        // handle error
    }
    var val interface{}
    err = dec.Decode(&val)
    if err != nil {
        // handle error
    }
    fmt.Printf(" %s : %v\n", key, val)
}
https://play.golang.org/p/5r1d8MsNlKb

How to read a JSON object in Go without decoding it (for use in reading a large stream)

I am reading JSON in response to an HTTP endpoint and would like to extract the contents of an array of objects which is nested inside. The response can be large so I am trying to use a streaming approach instead of just json.Unmarshal'ing the whole thing. The JSON looks like so:
{
    "useless_thing_1": { /* etc */ },
    "useless_thing_2": { /* etc */ },
    "the_things_i_want": [
        { /* complex object I want to json.Unmarshal #1 */ },
        { /* complex object I want to json.Unmarshal #2 */ },
        { /* complex object I want to json.Unmarshal #3 */ },
        /* could be many thousands of these */
    ],
    "useless_thing_3": { /* etc */ },
}
The json library provided with Go has json.Unmarshal, which works well for complete JSON objects. It also has json.Decoder, which can unmarshal full objects or provide individual tokens. I can use this tokenizer to carefully go through and extract things, but the logic to do so is somewhat complex and I can't then easily use json.Unmarshal on an object once I've read it as tokens.
The json.Decoder is buffered, which makes it difficult to read one object (i.e. { /* complex object I want to json.Unmarshal #1 */ }), then consume the , myself and make a new json.Decoder - because the first decoder will have read past the comma into its buffer. This is the approach I tried and haven't been able to make work.
I'm looking for a better solution to this problem. Here is the broken code when I tried to manually consume the commas:
// code here that naively looks for `"the_things_i_want": [` and
// puts the next bytes after that in `buffer`
// this is the rest of the stream starting from `{ /* complex object I want to json.Unmarshal #1 */ },`
in := io.MultiReader(buffer, res.Body)
dec := json.NewDecoder(in)
for {
    var p MyComplexThing
    err := dec.Decode(&p)
    if err != nil {
        panic(err)
    }
    // steal the comma from in directly - this does not work because the decoder buffers its input
    var b1 [1]byte
    _, err = io.ReadAtLeast(in, b1[:], 1) // returns random data from later in the stream
    if err != nil {
        panic(err)
    }
    switch b1[0] {
    case ',':
        // skip over it
    case ']':
        break // we're done
    default:
        panic(fmt.Errorf("Unexpected result from read %#v", b1))
    }
}
Use Decoder.Token and Decoder.More to decode a JSON document as a stream.
Walk through the document with Decoder.Token to the JSON value of interest. Call Decoder.Decode to unmarshal the JSON value into a Go value. Repeat as needed to slurp up all values of interest.
Here's some code with commentary explaining how it works:
func decode(r io.Reader) error {
    d := json.NewDecoder(r)
    // We expect that the JSON document is an object.
    if err := expect(d, json.Delim('{')); err != nil {
        return err
    }
    // While there are fields in the object...
    for d.More() {
        // Get field name
        t, err := d.Token()
        if err != nil {
            return err
        }
        // Skip value if not the field that we are looking for.
        if t != "the_things_i_want" {
            if err := skip(d); err != nil {
                return err
            }
            continue
        }
        // We expect JSON array value for the field.
        if err := expect(d, json.Delim('[')); err != nil {
            return err
        }
        // While there are more JSON array elements...
        for d.More() {
            // Unmarshal and process the array element.
            var m map[string]interface{}
            if err := d.Decode(&m); err != nil {
                return err
            }
            fmt.Printf("found %v\n", m)
        }
        // We are done decoding the array.
        return nil
    }
    return errors.New("things I want not found")
}
// skip skips the next value in the JSON document.
func skip(d *json.Decoder) error {
    n := 0
    for {
        t, err := d.Token()
        if err != nil {
            return err
        }
        switch t {
        case json.Delim('['), json.Delim('{'):
            n++
        case json.Delim(']'), json.Delim('}'):
            n--
        }
        if n == 0 {
            return nil
        }
    }
}

// expect returns an error if the next token in the document is not expectedT.
func expect(d *json.Decoder, expectedT interface{}) error {
    t, err := d.Token()
    if err != nil {
        return err
    }
    if t != expectedT {
        return fmt.Errorf("got token %v, want token %v", t, expectedT)
    }
    return nil
}
Run it on the playground.
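For reference, a minimal harness to exercise it (my own addition; the document literal is a trimmed stand-in for the response body above, and it additionally needs "strings" and "log"):
func main() {
    const body = `{
        "useless_thing_1": {"a": 1},
        "the_things_i_want": [{"id": 1}, {"id": 2}, {"id": 3}],
        "useless_thing_3": {"b": 2}
    }`
    if err := decode(strings.NewReader(body)); err != nil {
        log.Fatal(err)
    }
    // prints: found map[id:1], found map[id:2], found map[id:3] (one per line)
}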

Go openweathermap forecast return type

I am new to Go and I am trying to build a little weather app using OpenWeatherMap and the go package by briandowns. I have no problem reading the current weather, but I have trouble processing the results of the forecast methods.
func main() {
    apiKey := "XXXX"
    w, err := owm.NewForecast("5", "C", "en", apiKey)
    if err != nil {
        log.Fatal(err)
    }
    w.DailyByName("London", 1)
    data := w.ForecastWeatherJson
    fmt.Println(data)
}
where the apiKey needs to be replaced by a valid one (which one can get for free upon registration).
My problem is to extract the information from the ForecastWeatherJson.
It is defined as:
type ForecastWeatherJson interface {
    Decode(r io.Reader) error
}
in the forecast.go file.
With Decode defined as:
func (f *Forecast5WeatherData) Decode(r io.Reader) error {
    if err := json.NewDecoder(r).Decode(&f); err != nil {
        return err
    }
    return nil
}
in forecast5.go.
I really do not know where to start, as I did not find a documented example showing how to process the data except in other languages (so I guess it's a Go-specific problem).
I saw how it can be done in e.g. Python, but in the Go case the return type is not clear to me.
Any hints or links to examples are appreciated.
The data you need is already decoded in your w param, but you need to type-assert it to the correct weather type. In your case, because you are using type "5", you should use owm.Forecast5WeatherData. Then your main will look like this:
func main() {
    apiKey := "XXXX"
    w, err := owm.NewForecast("5", "C", "en", apiKey)
    if err != nil {
        log.Fatal(err)
    }
    w.DailyByName("London", 3)
    if val, ok := w.ForecastWeatherJson.(*owm.Forecast5WeatherData); ok {
        fmt.Println(val)
        fmt.Println(val.City)
        fmt.Println(val.Cnt)
    }
}

Determine a JSON tag efficiently

I have a bunch of JSON files, each containing a very large array of complex data. The JSON files look something like:
ids.json:
{
    "ids": [1,2,3]
}
names.json:
{
    "names": ["Tyrion","Jaime","Cersei"]
}
and so on. (In reality, the array elements are complex struct objects with 10s of fields)
I want to extract just the tag that specifies what kind of array it contains. Currently I'm using encoding/json to unmarshal the whole file into a map[string]interface{} and iterate through the map but that is too costly an operation.
Is there a faster way of doing this, preferably without the involvement of unmarshaling entire data?
You can offset the reader to right after the opening curly brace and then use json.Decoder to decode only the first token from the reader.
Something along these lines:
sr := strings.NewReader(`{
    "ids": [1,2,3]
}`)
for {
    b, err := sr.ReadByte()
    if err != nil {
        fmt.Println(err)
        return
    }
    if b == '{' {
        break
    }
}
d := json.NewDecoder(sr)
var key string
err := d.Decode(&key)
if err != nil {
    fmt.Println(err)
    return
}
fmt.Println(key)
https://play.golang.org/p/xJJEqj0tFk9
Additionally, you may wrap the io.Reader you obtained from opening the file in a bufio.Reader to avoid multiple single-byte reads.
This solution assumes the contents are a valid JSON object. Not that you could avoid that assumption anyway.
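A rough sketch of that wrapping for an actual file (my own illustration; ids.json is the hypothetical file name from the question, and it needs "bufio", "encoding/json", "fmt", "log", "os"):
f, err := os.Open("ids.json")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

br := bufio.NewReader(f) // buffer the file so the byte-by-byte scan below is cheap
for {
    b, err := br.ReadByte()
    if err != nil {
        log.Fatal(err)
    }
    if b == '{' {
        break
    }
}

var key string
if err := json.NewDecoder(br).Decode(&key); err != nil {
    log.Fatal(err)
}
fmt.Println(key) // "ids"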
I had a play around with Decoder.Token() reading one token at a time (see this example, line 87), and this works to extract your array label:
const jsonStream = `{
    "ids": [1,2,3]
}`
dec := json.NewDecoder(strings.NewReader(jsonStream))
t, err := dec.Token()
if err != nil {
    log.Fatal(err)
}
fmt.Printf("First token: %v\n", t)
t, err = dec.Token()
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Second token (array label): %v\n", t)