Parse JSON having sibling dynamic keys alongside with static in Go - json

I need to parse this json
{
"version": "1.1.29-snapshot",
"linux-amd64": {
"url": "https://origin/path",
"size": 7794688,
"sha256": "14b3c3ad05e3a98d30ee7e774646aec7ffa8825a1f6f4d9c01e08bf2d8a08646"
},
"windows-amd64": {
"url": "https://origin/path",
"size": 8102400,
"sha256": "01b8b927388f774bdda4b5394e381beb592d8ef0ceed69324d1d42f6605ab56d"
}
}
Keys like linux-amd64 are dynamic and theirs amount is arbitrary. I tried something like that to describe it and unmarshal. Obviously it doesn't work. Items is always empty.
type FileInfo struct {
Url string `json:"url"`
Size int64 `json:"size"`
Sha256 string `json:"sha256"`
}
type UpdateInfo struct {
Version string `json:"version"`
Items map[string]FileInfo
}
It's similar to this use case, but has no parent key items. I suppose I can use 3rd party library or map[string]interface{} approach, but I'm interested in knowing how to achieve this with explicitly declared types.
The rest of the parsing code is:
func parseUpdateJson(jsonStr []byte) (UpdateInfo, error) {
var allInfo = UpdateInfo{Items: make(map[string]FileInfo)}
var err = json.Unmarshal(jsonStr, &allInfo)
return allInfo, err
}
Look at the link I attached and you will realize that is not that simple as you think. Also I pointed that I interested in typed approach. Ok, how to declare this map[string]FileInfo to get parsed?

You can create a json.Unmarshaller to decode the json into a map, then apply those values to your struct: https://play.golang.org/p/j1JXMpc4Q9u
type FileInfo struct {
Url string `json:"url"`
Size int64 `json:"size"`
Sha256 string `json:"sha256"`
}
type UpdateInfo struct {
Version string `json:"version"`
Items map[string]FileInfo
}
func (i *UpdateInfo) UnmarshalJSON(d []byte) error {
tmp := map[string]json.RawMessage{}
err := json.Unmarshal(d, &tmp)
if err != nil {
return err
}
err = json.Unmarshal(tmp["version"], &i.Version)
if err != nil {
return err
}
delete(tmp, "version")
i.Items = map[string]FileInfo{}
for k, v := range tmp {
var item FileInfo
err := json.Unmarshal(v, &item)
if err != nil {
return err
}
i.Items[k] = item
}
return nil
}

This answer is adapted from this recipe in my YouTube video on advanced JSON handling in Go.
func (u *UpdateInfo) UnmarshalJSON(d []byte) error {
var x struct {
UpdateInfo
UnmarshalJSON struct{}
}
if err := json.Unmarshal(d, &x); err != nil {
return err
}
var y map[string]json.RawMessage{}
if err := json.Unsmarshal(d, &y); err != nil {
return err
}
delete(y, "version"_ // We don't need this in the map
*u = x.UpdateInfo
u.Items = make(map[string]FileInfo, len(y))
for k, v := range y {
var info FileInfo
if err := json.Unmarshal(v, &info); err != nil {
return err
}
u.Items[k] = info
}
return nil
}
It:
Unmarshals the JSON into the struct directly, to get the struct fields.
It re-unmarshals into a map of map[string]json.RawMessage to get the arbitrary keys. This is necessary since the value of version is not of type FileInfo, and trying to unmarshal directly into map[string]FileInfo will thus error.
It deletes the keys we know we already got in the struct fields.
It then iterates through the map of string to json.RawMessage, and finally unmarshals each value into the FileInfo type, and stores it in the final object.
If you really don't want to unmarshal multiple times, your next best option is to iterate over the JSON tokens in your input by using the json.Decoder type. I've done this in a couple of performance-sensitive bits of code, but it makes your code INCREDIBLY hard to read, and in almost all cases is not worth the effort.

Related

Custom unmarshaling a struct into a map of slices

I thought I understood unmarshalling by now, but I guess not. I'm having a little bit of trouble unmarshalling a map in go. Here is the code that I have so far
type OHLC_RESS struct {
Pair map[string][]Candles
Last int64 `json:"last"`
}
type Candles struct {
Time uint64
Open string
High string
Low string
Close string
VWAP string
Volume string
Count int
}
func (c *Candles) UnmarshalJSON(d []byte) error {
tmp := []interface{}{&c.Time, &c.Open, &c.High, &c.Low, &c.Close, &c.VWAP, &c.Volume, &c.Count}
length := len(tmp)
err := json.Unmarshal(d, &tmp)
if err != nil {
return err
}
g := len(tmp)
if g != length {
return fmt.Errorf("Lengths don't match: %d != %d", g, length)
}
return nil
}
func main() {
response := []byte(`{"XXBTZUSD":[[1616662740,"52591.9","52599.9","52591.8","52599.9","52599.1","0.11091626",5],[1616662740,"52591.9","52599.9","52591.8","52599.9","52599.1","0.11091626",5]],"last":15}`)
var resp OHLC_RESS
err := json.Unmarshal(response, &resp)
fmt.Println("resp: ", resp)
}
after running the code, the last field will unmarshal fine, but for whatever reason, the map is left without any value. Any help?
The expedient solution, for the specific example JSON, would be to NOT use a map at all but instead change the structure of OHLC_RESS so that it matches the structure of the JSON, i.e.
type OHLC_RESS struct {
Pair []Candles `json:"XXBTZUSD"`
Last int64 `json:"last"`
}
https://go.dev/play/p/Z9PhJt3wX33
However it's safe to assume, I think, that the reason you've opted to use a map is because the JSON object's key(s) that hold the "pairs" can vary and so hardcoding them into the field's tag is out of the question.
To understand why your code doesn't produce the desired result, you have to realize two things. First, the order of a struct's fields has no bearing on how the keys of a JSON object will be decoded. Second, the name Pair holds no special meaning for the unmarshaler. Therefore, by default, the unmarshaler has no way of knowing that your wish is to decode the "XXBTZUSD": [ ... ] element into the Pair map.
So, to get your desired result, you can have the OHLC_RESS implement the json.Unmarshaler interface and do the following:
func (r *OHLC_RESS) UnmarshalJSON(d []byte) error {
// first, decode just the object's keys and leave
// the values as raw, non-decoded JSON
var obj map[string]json.RawMessage
if err := json.Unmarshal(d, &obj); err != nil {
return err
}
// next, look up the "last" element's raw, non-decoded value
// and, if it is present, then decode it into the Last field
if last, ok := obj["last"]; ok {
if err := json.Unmarshal(last, &r.Last); err != nil {
return err
}
// remove the element so it's not in
// the way when decoding the rest below
delete(obj, "last")
}
// finally, decode the rest of the element values
// in the object and store them in the Pair field
r.Pair = make(map[string][]Candles, len(obj))
for key, val := range obj {
cc := []Candles{}
if err := json.Unmarshal(val, &cc); err != nil {
return err
}
r.Pair[key] = cc
}
return nil
}
https://go.dev/play/p/Lj8a8Gx9fWH

Go reading map from json stream

I need to parse really long json file (more than million items). I don't want to load it to the memory and read it chunk by chunk. There's a good example with the array of items here. The problem is that I deal with the map. And when I call Decode I get not at beginning of value.
I can't get what should be changed.
const data = `{
"object1": {"name": "cattle","location": "kitchen"},
"object2": {"name": "table","location": "office"}
}`
type ReadObject struct {
Name string `json:"name"`
Location string `json:"location"`
}
func ParseJSON() {
dec := json.NewDecoder(strings.NewReader(data))
tkn, err := dec.Token()
if err != nil {
log.Fatalf("failed to read opening token: %v", err)
}
fmt.Printf("opening token: %v\n", tkn)
objects := make(map[string]*ReadObject)
for dec.More() {
var nextSymbol string
if err := dec.Decode(&nextSymbol); err != nil {
log.Fatalf("failed to parse next symbol: %v", err)
}
nextObject := &ReadObject{}
if err := dec.Decode(&nextObject); err != nil {
log.Fatalf("failed to parse next object")
}
objects[nextSymbol] = nextObject
}
tkn, err = dec.Token()
if err != nil {
log.Fatalf("failed to read closing token: %v", err)
}
fmt.Printf("closing token: %v\n", tkn)
fmt.Printf("OBJECTS: \n%v\n", objects)
}
TL,DR: when you are calling Token() method for a first time, you move offset from the beginning (of a JSON value) and therefore you get the error.
You are working with this struct (link):
type Decoder struct {
// others fields omits for simplicity
tokenState int
}
Pay attention for a tokenState field. This value could be one of (link):
const (
tokenTopValue = iota
tokenArrayStart
tokenArrayValue
tokenArrayComma
tokenObjectStart
tokenObjectKey
tokenObjectColon
tokenObjectValue
tokenObjectComma
)
Let's back to your code. You are calling Token() method. This method obtains first JSON-valid token { and changes tokenState from tokenObjectValue to the tokenObjectStart (link). Now you are "in-an-object" state.
If you try to call Decode() at this point you will get an error (not at beginning of value). This is because allowed states of tokenState for calling Decode() are tokenTopValue, tokenArrayStart, tokenArrayValue, tokenObjectValue, i.e. "full" value, not part of it (link).
To avoid this you can just don't call Token() at all and do something like this:
dec := json.NewDecoder(strings.NewReader(dataMapFromJson))
objects := make(map[string]*ReadObject)
if err := dec.Decode(&objects); err != nil {
log.Fatalf("failed to parse next symbol: %v", err)
}
fmt.Printf("OBJECTS: \n%v\n", objects)
Or, if you want to read chunk-by-chunk, you could keep calling Token() until you reach "full" value. And then call Decode() on this value (I guess this should work).
After consuming the initial { with your first call to dec.Token(), you must :
use dec.Token() to extract the next key
after extracting the key, you can call dec.Decode(&nextObject) to decode an entry
example code :
for dec.More() {
key, err := dec.Token()
if err != nil {
// handle error
}
var val interface{}
err = dec.Decode(&val)
if err != nil {
// handle error
}
fmt.Printf(" %s : %v\n", key, val)
}
https://play.golang.org/p/5r1d8MsNlKb

How do you modify this struct in Golang to accept two different results?

For the following JSON response {"table_contents":[{"id":100,"description":"text100"},{"id":101,"description":"text101"},{"id":1,"description":"text1"}]}
All you have to do is to produce the following code to execute it properly and be able to reads fields from the struct, such as:
package main
import (
"fmt"
"encoding/json"
)
type MyStruct1 struct {
TableContents []struct {
ID int
Description string
} `json:"table_contents"`
}
func main() {
result:= []byte(`{"table_contents":[{"id":100,"description":"text100"},{"id":101,"description":"text101"},{"id":1,"description":"text1"}]}`)
var container MyStruct1
err := json.Unmarshal(result, &container)
if err != nil {
fmt.Println(" [0] Error message: " + err.Error())
return
}
for i := range container.TableContents {
fmt.Println(container.TableContents[i].Description)
}
}
But how do you deal with the following JSON response? {"table_contents":[[{"id":100,"description":"text100"},{"id":101,"description":"text101"}],{"id":1,"description":"text1"}]} You can either get this response or the one above, it is important to modify the struct to accept both.
I did something like this, with the help of internet:
package main
import (
"fmt"
"encoding/json"
)
type MyStruct1 struct {
TableContents []TableContentUnion `json:"table_contents"`
}
type TableContentClass struct {
ID int
Description string
}
type TableContentUnion struct {
TableContentClass *TableContentClass
TableContentClassArray []TableContentClass
}
func main() {
result:= []byte(`{"table_contents":[[{"id":100,"description":"text100"},{"id":101,"description":"text101"}],{"id":1,"description":"text1"}]}`)
var container MyStruct1
err := json.Unmarshal(result, &container)
if err != nil {
fmt.Println(" [0] Error message: " + err.Error())
return
}
for i := range container.TableContents {
fmt.Println(container.TableContents[i])
}
}
but it does not go past the error message :(
[0] Error message: json: cannot unmarshal array into Go struct field MyStruct1.table_contents of type main.TableContentUnion*
Been struggling to come up with a solution for hours. If someone could help I would be happy. Thank you for reading. Let me know if you have questions
Inside table_contents you have two type options (json object or list of json objects). What you can do is to unmarshall into an interface and then run type-check on it when using it:
type MyStruct1 struct {
TableContents []interface{} `json:"table_contents"`
}
...
for i := range container.TableContents {
switch container.TableContents[i].(type){
case map[string]interface{}:
fmt.Println("json object")
case []interface{}:
fmt.Println("list")
}
}
From there you can use some library (e.g. https://github.com/mitchellh/mapstructure) to map unmarshalled struct to your TableContentClass type. See PoC playground here: https://play.golang.org/p/NhVUhQayeL_C
Custom UnmarshalJSON function
You can also create a custom UnmarshalJSON function on the object that has the 2 possibilities. In you case that would be TableContentUnion.
In the custom unmarshaller you can then decide how to unmarshal the content.
func (s *TableContentUnion) UnmarshalJSON(b []byte) error {
// Note that we get `b` as bytes, so we can also manually check to see
// if it is an array (starts with `[`) or an object (starts with `{`)
var jsonObj interface{}
if err := json.Unmarshal(b, &jsonObj); err != nil {
return err
}
switch jsonObj.(type) {
case map[string]interface{}:
// Note: instead of using json.Unmarshal again, we could also cast the interface
// and build the values as in the example above
var tableContentClass TableContentClass
if err := json.Unmarshal(b, &tableContentClass); err != nil {
return err
}
s.TableContentClass = &tableContentClass
case []interface{}:
// Note: instead of using json.Unmarshal again, we could also cast the interface
// and build the values as in the example above
if err := json.Unmarshal(b, &s.TableContentClassArray); err != nil {
return err
}
default:
return errors.New("TableContentUnion.UnmarshalJSON: unknown content type")
}
return nil
}
The rest then works like in your test code that was failing before. Here the working Go Playground
Unmarshal to map and manually build struct
You can always unmarshal a json (with an object at the root) into a map[string]interface{}. Then you can iterate things and further unmarshal them after checking what type they are.
Working example:
func main() {
result := []byte(`{"table_contents":[[{"id":100,"description":"text100"},{"id":101,"description":"text101"}],{"id":1,"description":"text1"}]}`)
var jsonMap map[string]interface{}
err := json.Unmarshal(result, &jsonMap)
if err != nil {
fmt.Println(" [0] Error message: " + err.Error())
return
}
cts, ok := jsonMap["table_contents"].([]interface{})
if !ok {
// Note: nil or missing 'table_contents" will also lead to this path.
fmt.Println("table_contents is not a slice")
return
}
var unions []TableContentUnion
for _, content := range cts {
var union TableContentUnion
if contents, ok := content.([]interface{}); ok {
for _, content := range contents {
contCls := parseContentClass(content)
if contCls == nil {
continue
}
union.TableContentClassArray = append(union.TableContentClassArray, *contCls)
}
} else {
contCls := parseContentClass(content)
union.TableContentClass = contCls
}
unions = append(unions, union)
}
container := MyStruct1{
TableContents: unions,
}
for i := range container.TableContents {
fmt.Println(container.TableContents[i])
}
}
func parseContentClass(value interface{}) *TableContentClass {
m, ok := value.(map[string]interface{})
if !ok {
return nil
}
return &TableContentClass{
ID: int(m["id"].(float64)),
Description: m["description"].(string),
}
}
This is most useful if the json has too many variations. For cases like this it might also make sense sometimes to switch to a json package that works differently like https://github.com/tidwall/gjson which gets values based on their path.
Use json.RawMessage to capture the varying parts of the JSON document. Unmarshal each raw message as appropriate.
func (ms *MyStruct1) UnmarshalJSON(data []byte) error {
// Declare new type with same base type as MyStruct1.
// This breaks recursion in call to json.Unmarshal below.
type x MyStruct1
v := struct {
*x
// Override TableContents field with raw message.
TableContents []json.RawMessage `json:"table_contents"`
}{
// Unmarshal all but TableContents directly to the
// receiver.
x: (*x)(ms),
}
err := json.Unmarshal(data, &v)
if err != nil {
return err
}
// Unmarahal raw elements as appropriate.
for _, tcData := range v.TableContents {
if bytes.HasPrefix(tcData, []byte{'{'}) {
var v TableContentClass
if err := json.Unmarshal(tcData, &v); err != nil {
return err
}
ms.TableContents = append(ms.TableContents, v)
} else {
var v []TableContentClass
if err := json.Unmarshal(tcData, &v); err != nil {
return err
}
ms.TableContents = append(ms.TableContents, v)
}
}
return nil
}
Use it like this:
var container MyStruct1
err := json.Unmarshal(result, &container)
if err != nil {
// handle error
}
Run it on the Go playground.
This approach does not add any outside dependencies. The function code does not need to modified when fields are added or removed from MyStruct1 or TableContentClass.

Custom marshalling to bson and JSON (Golang & mgo)

I have the following type in Golang:
type Base64Data []byte
In order to support unmarshalling a base64 encoded string to this type, I did the following:
func (b *Base64Data) UnmarshalJSON(data []byte) error {
if len(data) == 0 {
return nil
}
content, err := base64.StdEncoding.DecodeString(string(data[1 : len(data)-1]))
if err != nil {
return err
}
*b = []byte(xml)
return nil
}
Now I also want to be able to marshal and unmarshal it to mongo database, using mgo Golang library.
The problem is that I already have documents there stored as base64 encoded string, so I have to maintain that.
I tried to do the following:
func (b Base64Data) GetBSON() (interface{}, error) {
return base64.StdEncoding.EncodeToString([]byte(b)), nil
}
func (b *Base64DecodedXml) SetBSON(raw bson.Raw) error {
var s string
var err error
if err = raw.Unmarshal(&s); err != nil {
return err
}
*b, err = base64.StdEncoding.DecodeString(s)
return err
}
So that after unmarshaling, the data is already decoded, so I need to encode it back, and return it as a string so it will be written to db as a string (and vice versa)
For that I implemented bson getter and setter, but it seems only the getter is working properly
JSON unmarshaling from base64 encoded string works, as well marshaling it to database. but unmarshling setter seems to not be called at all.
Can anyone suggest what I'm missing, so that I'll be able to properly hold the data decoded in memory, but encoded string type?
This is a test I tried to run:
b := struct {
Value shared.Base64Data `json:"value" bson:"value"`
}{}
s := `{"value": "PHJvb3Q+aGVsbG88L3Jvb3Q+"}`
require.NoError(t, json.Unmarshal([]byte(s), &b))
t.Logf("%v", string(b.Value))
b4, err := bson.Marshal(b)
require.NoError(t, err)
t.Logf("%v", string(b4))
require.NoError(t, bson.Unmarshal(b4, &b))
t.Logf("%v", string(b.Value))
You can't marshal any value with bson.Marshal(), only maps and struct values.
If you want to test it, pass a map, e.g. bson.M to bson.Marshal():
var x = Base64Data{0x01, 0x02, 0x03}
dd, err := bson.Marshal(bson.M{"data": x})
fmt.Println(string(dd), err)
Your code works as-is, and as you intend it to. Try to insert a wrapper value to verify it:
c := sess.DB("testdb").C("testcoll")
var x = Base64Data{0x01, 0x02, 0x03}
if err := c.Insert(bson.M{
"data": x,
}); err != nil {
panic(err)
}
This will save the data as a string, being the Base64 encoded form.
Of course if you want to load it back into a value of type Base64Data, you will also need to define the SetBSON(raw Raw) error method too (bson.Setter interface).

Go parse JSON array of array

I have data like this
"descriptionMap": [[[1,2], "a"], [[3,4], "b"]]
and I was trying to decode it with
DescriptionMap []struct {
OpcodeTableIdPair []int
OpcodeDescription string
} `json:"descriptionMap"`
but I keep on getting empty arrays,
[[{[] } {[] }]]
You have a very unfortunate JSON schema which treats arrays as objects. The best you can do in this situation is something like this:
type Body struct {
DescriptionMap []Description `json:"descriptionMap"`
}
type Description struct {
IDPair []int
Description string
}
func (d *Description) UnmarshalJSON(b []byte) error {
arr := []interface{}{}
err := json.Unmarshal(b, &arr)
if err != nil {
return err
}
idPair := arr[0].([]interface{})
d.IDPair = make([]int, len(idPair))
for i := range idPair {
d.IDPair[i] = int(idPair[i].(float64))
}
d.Description = arr[1].(string)
return nil
}
Playground: https://play.golang.org/p/MPho12GJfc.
Notice though that this will panic if any of the types in the JSON don't match. You can create a better version which returns errors in such cases.