Unmarshal JSON into slice of maps with unique elements - json

I'm unmarshaling some JSON files into []map[string]string{},
but the source is often dirty, with many repeated identical objects.
The input looks like:
[{"sa1":"8172"},{"sa3":"8175"},{"sa1":"8172"},{"sa3":"8175"},{"sa3":"8175"},{"sa3":"8175"},{"sa1":"8172"},{"sa3":"8175"},{"sa3":"8175"}]
Resulting in:
map[sa1:8172]
([]map[string]string) (len=9 cap=9) {
(map[string]string) (len=1) {
(string) (len=3) "sa1": (string) (len=4) "8172"
},
(map[string]string) (len=1) {
(string) (len=3) "sa3": (string) (len=4) "8175"
},
(map[string]string) (len=1) {
(string) (len=3) "sa1": (string) (len=4) "8172"
},
(map[string]string) (len=1) {
(string) (len=3) "sa3": (string) (len=4) "8175"
},
(map[string]string) (len=1) {
(string) (len=3) "sa3": (string) (len=4) "8175"
},
(map[string]string) (len=1) {
(string) (len=3) "sa3": (string) (len=4) "8175"
},
(map[string]string) (len=1) {
(string) (len=3) "sa1": (string) (len=4) "8172"
},
(map[string]string) (len=1) {
(string) (len=3) "sa3": (string) (len=4) "8175"
},
(map[string]string) (len=1) {
(string) (len=3) "sa3": (string) (len=4) "8175"
}
}
How could I clean the slice of maps to contain only unique elements?

One option is to unmarshal the key value pairs directly into a comparable type, like a struct:
type Elem struct {
k string
v string
}
func (e *Elem) UnmarshalJSON(d []byte) error {
m := map[string]string{}
if err := json.Unmarshal(d, &m); err != nil {
return err
}
for k, v := range m {
e.k = k
e.v = v
return nil
}
return nil
}
Once you can compare the elements individually, you could also wrap that in a collection which filters the elements while unmarshaling. Whether to do this implicitly here, or after the fact is a matter of opinion. It may be a better separation of concerns to make filtering its own method, but I included it in UnmarshalJSON for brevity.
type Elems []Elem
func (e *Elems) UnmarshalJSON(d []byte) error {
tmp := []Elem{}
err := json.Unmarshal(d, &tmp)
if err != nil {
return err
}
seen := map[Elem]bool{}
for _, elem := range tmp {
if seen[elem] {
continue
}
seen[elem] = true
*e = append(*e, elem)
}
return nil
}
Then you can unmarshal into Elems:
elems := Elems{}
err := json.Unmarshal(js, &elems)
if err != nil {
log.Fatal(err)
}
fmt.Println(elems)
Which will give you the two unique pairs: [{sa1 8172} {sa3 8175}]
https://go.dev/play/p/U0iqBAjvz-1
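If you would rather keep the original []map[string]string type and filter after the fact, here is a minimal sketch of that alternative. The dedupe helper is a hypothetical name, not part of the answer above. It assumes that using each map's JSON encoding as a comparison key is acceptable, which works because json.Marshal writes map keys in sorted order, so equal maps encode identically.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// dedupe is a hypothetical helper. It keeps the first occurrence of each map,
// using the map's JSON encoding as the comparison key. json.Marshal sorts map
// keys, so two maps with equal contents always produce the same key.
func dedupe(in []map[string]string) ([]map[string]string, error) {
	seen := make(map[string]bool, len(in))
	out := make([]map[string]string, 0, len(in))
	for _, m := range in {
		b, err := json.Marshal(m)
		if err != nil {
			return nil, err
		}
		key := string(b)
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, m)
	}
	return out, nil
}

func main() {
	js := []byte(`[{"sa1":"8172"},{"sa3":"8175"},{"sa1":"8172"},{"sa3":"8175"}]`)

	var ms []map[string]string
	if err := json.Unmarshal(js, &ms); err != nil {
		log.Fatal(err)
	}

	unique, err := dedupe(ms)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(unique) // prints [map[sa1:8172] map[sa3:8175]]
}
```

This costs one extra Marshal per element, so the struct-based approach above is probably the better fit when every object is known to hold a single key/value pair.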

Related

How to handle unmarshaling to a custom interface whose type could only be determined after unmarshaling

I have a json response like this
{
"foo" : "bar",
"object" : {
"type" : "action",
"data" : "somedata"
}
}
Here the object could be one of multiple types. I define the types and have them implement a common interface.
type IObject interface {
GetType() string
}
type Action struct {
Type string `json:"type"`
Data string `json:"data"`
}
func (a Action) GetType() string {
return "action"
}
type Activity struct {
Type string `json:"type"`
Duration int `json:"duration"`
}
func (a Activity) GetType() string {
return "activity"
}
And a response struct
type Response struct {
Foo string `json:"foo"`
Object IObject `json:"object"`
}
As the type information of a struct that implements IObject is contained within the struct itself, there is no way to learn it without unmarshaling. I also cannot change the structure of the JSON response. Currently I am dealing with this problem using a custom unmarshaller:
func UnmarshalObject(m map[string]interface{}, object *IObject) error {
if m["type"] == "action" {
b, err := json.Marshal(m)
if err != nil {
return err
}
action := Action{}
if err = json.Unmarshal(b, &action); err != nil {
return err
}
*object = action
return nil
}
if m["type"] == "activity" {
b, err := json.Marshal(m)
if err != nil {
return err
}
activity := Activity{}
if err = json.Unmarshal(b, &activity); err != nil {
return err
}
*object = activity
return nil
}
return errors.New("unknown actor type")
}
func (r *Response) UnmarshalJSON(data []byte) error {
raw := struct {
Foo string `json:"foo"`
Object interface{} `json:"object"`
}{}
err := json.Unmarshal(data, &raw)
if err != nil {
return err
}
r.Foo = raw.Foo
if err = UnmarshalObject(raw.Object.(map[string]interface{}), &r.Object); err != nil {
return err
}
return nil
}
So what I do is basically
Unmarshal the object into an interface{}
Type-assert it to map[string]interface{}
Read the "type" value to determine the type
Create a new instance of the determined type
Marshal back to json
Unmarshal again to the new instance of the determined type
Assign the instance to the field
This feels off and I am not comfortable with it. Especially the marshaling/unmarshaling back and forth. Is there a more elegant way to solve this problem?
You can use json.RawMessage.
func (r *Response) UnmarshalJSON(data []byte) error {
var raw struct {
Foo string `json:"foo"`
Object json.RawMessage `json:"object"`
}
if err := json.Unmarshal(data, &raw); err != nil {
return err
}
r.Foo = raw.Foo
var obj struct {
Type string `json:"type"`
}
if err := json.Unmarshal(raw.Object, &obj); err != nil {
return err
}
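switch obj.Type {
case "action":
r.Object = new(Action)
case "activity":
r.Object = new(Activity)
default:
// Without this case r.Object stays nil and the Unmarshal call below
// fails with a much less helpful error.
return fmt.Errorf("unknown object type %q", obj.Type)
}
return json.Unmarshal(raw.Object, r.Object)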
}
https://go.dev/play/p/6dqiybS4zNp

Decoding a dynamic field in Go

I get a response from an external API that has a field which can have 2 values:
{"field": []}
or
{"field": {"key1": "value", "key2": "value"}}
I set the struct to be
type Object struct {
Field map[string]string `json:"field,omitempty"`
}
And then call my own implemented function to decode the response
func decode(response *http.Response) (*Object, error) {
// Use a name other than "response" so the result does not redeclare the parameter.
var obj Object
err := json.NewDecoder(response.Body).Decode(&obj)
if err != nil {
return nil, err
}
return &obj, nil
}
But this works only for the second response (when the field is not empty). For the first response I get an error.
You can define a custom type for the field that implements its own unmarshaler. Example:
type keys struct {
Key1 string
Key2 string
}
type mytype struct {
EmptySlice bool
Keys *keys
}
func (m *mytype) UnmarshalJSON(b []byte) error {
if bytes.Equal(b, []byte("[]")) {
m.EmptySlice = true
return nil
}
m.Keys = &keys{}
return json.Unmarshal(b, &m.Keys)
}
type Object struct {
Field mytype `json:"field"`
}
func main() {
input := []string{
`{"field": []}`,
`{"field": {"key1": "value", "key2": "value"}}`,
}
for i, s := range input {
var o Object
err := json.Unmarshal([]byte(s), &o)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%d: %+v\n", i+1, o)
}
}
https://go.dev/play/p/OqSKfUXHFyb
You can use a custom type and implement the UnmarshalJSON interface method on that type.
For example:
type Field struct {
arr []string
m map[string]string
}
func (f *Field) UnmarshalJSON(b []byte) error {
var m map[string]string
err := json.Unmarshal(b, &m)
if err == nil {
f.m = m
return nil
}
var arr []string
err = json.Unmarshal(b, &arr)
if err == nil {
f.arr = arr
return nil
}
return fmt.Errorf("type of property not array or map")
}
https://go.dev/play/p/JuFE--hWAjw

How do I json unmarshal slice inside a slice

I am trying to unmarshal some pretty ugly json but can't figure out how. I have:
package main
import "fmt"
import "encoding/json"
type PublicKey struct {
ID int `json:"id"`
Key string `json:"key"`
MyData []struct {
ID string `json:"id"`
Value int `json:"value"`
}
}
func main() {
b := `[
{
"id": 1,
"key": "my_key"
},
[
{
"id": "some_id",
"value": 12
},
{
"id": "anorther_id",
"value": 13
}
]
]`
var pk []PublicKey
err := json.Unmarshal([]byte(b), &pk)
if err != nil {
fmt.Println(err)
}
fmt.Println(pk)
}
For the result I am getting:
[{1 my_key []} {0 []}]
The second slice is empty when it shouldn't be.
EDIT:
The error I get is:
json: cannot unmarshal array into Go struct field PublicKey.key of type main.PublicKey
https://play.golang.org/p/cztXOchiiS5
That is some truly hideous JSON! I have two approaches to handling the mixed array elements and I like the 2nd one better. Here's the first approach using interface and a type switch:
package main
import (
"encoding/json"
"errors"
"fmt"
)
type PublicKey struct {
ID int `json:"id"`
Key string `json:"key"`
}
type MyData struct {
ID string `json:"id"`
Value int `json:"value"`
}
type MixedData struct {
Key []PublicKey
MyData [][]MyData
}
func (md *MixedData) UnmarshalJSON(b []byte) error {
md.Key = []PublicKey{}
md.MyData = [][]MyData{}
var obj []interface{}
err := json.Unmarshal([]byte(b), &obj)
if err != nil {
return err
}
for _, o := range obj {
switch o.(type) {
case map[string]interface{}:
m := o.(map[string]interface{})
id, ok := m["id"].(float64)
if !ok {
return errors.New("public key id must be an int")
}
pk := PublicKey{}
pk.ID = int(id)
pk.Key, ok = m["key"].(string)
if !ok {
return errors.New("public key key must be a string")
}
md.Key = append(md.Key, pk)
case []interface{}:
a := o.([]interface{})
myData := make([]MyData, len(a))
for i, x := range a {
m, ok := x.(map[string]interface{})
if !ok {
return errors.New("data array contains unexpected object")
}
val, ok := m["value"].(float64)
if !ok {
return errors.New("data value must be an int")
}
myData[i].Value = int(val)
myData[i].ID, ok = m["id"].(string)
if !ok {
return errors.New("data id must be a string")
}
}
// Append once per inner array, after all of its elements are converted.
md.MyData = append(md.MyData, myData)
default:
// got something unexpected, handle somehow
}
}
return nil
}
func main() {
b := `[
{
"id": 1,
"key": "my_key"
},
[
{
"id": "some_id",
"value": 12
},
{
"id": "another_id",
"value": 13
}
]
]`
m := MixedData{}
err := json.Unmarshal([]byte(b), &m)
if err != nil {
fmt.Println(err)
}
fmt.Println(m)
}
https://play.golang.org/p/g8d_AsH-pYY
Hopefully there aren't any unexpected other elements, but they can be handled similarly.
Here is the second that relies more on Go's internal JSON parsing with the help of json.RawMessage. It makes the same assumptions about the contents of the array. It assumes that any objects will Unmarshal into PublicKey instances and any arrays consist of only MyData instances. I also added how to marshal back into the target JSON for symmetry:
package main
import (
"encoding/json"
"fmt"
"os"
)
type PublicKey struct {
ID int `json:"id"`
Key string `json:"key"`
}
type MyData struct {
ID string `json:"id"`
Value int `json:"value"`
}
type MixedData struct {
Keys []PublicKey
MyData [][]MyData
}
func (md *MixedData) UnmarshalJSON(b []byte) error {
md.Keys = []PublicKey{}
md.MyData = [][]MyData{}
obj := []json.RawMessage{}
err := json.Unmarshal([]byte(b), &obj)
if err != nil {
return err
}
for _, o := range obj {
switch o[0] {
case '{':
pk := PublicKey{}
err := json.Unmarshal(o, &pk)
if err != nil {
return err
}
md.Keys = append(md.Keys, pk)
case '[':
myData := []MyData{}
err := json.Unmarshal(o, &myData)
if err != nil {
return err
}
md.MyData = append(md.MyData, myData)
default:
// got something unexpected, handle somehow
}
}
return nil
}
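// A value receiver makes this method apply whether a MixedData or a
// *MixedData is passed to the encoder (main below passes a value).
func (md MixedData) MarshalJSON() ([]byte, error) {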
out := make([]interface{}, len(md.Keys)+len(md.MyData))
i := 0
for _, x := range md.Keys {
out[i] = x
i++
}
for _, x := range md.MyData {
out[i] = x
i++
}
return json.Marshal(out)
}
func main() {
b := `[
{
"id": 1,
"key": "my_key"
},
[
{
"id": "some_id",
"value": 12
},
{
"id": "another_id",
"value": 13
}
]
]`
m := MixedData{}
err := json.Unmarshal([]byte(b), &m)
if err != nil {
fmt.Println(err)
os.Exit(1)
}
fmt.Println(m)
enc := json.NewEncoder(os.Stdout)
enc.SetIndent("", " ")
if err := enc.Encode(m); err != nil {
fmt.Println(err)
os.Exit(1)
}
}
https://play.golang.org/p/ryZzaWKNcN0
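Here's an approach that combines json.RawMessage with a common trick for reusing the default unmarshaling logic inside a type that implements json.Unmarshaler: declare a temporary type with the same underlying type as the target. The temporary type has the same fields but none of the target's methods, so unmarshaling into it does not call UnmarshalJSON again.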
The idea is that we unmarshal the incoming array into a raw message and ensure that the array length is what we expect. Then we unmarshal the individual array elements into the custom struct types using their JSON tag annotations. The end result is that we can unmarshal the PublicKey type in the usual way and the UnmarshalJSON code is not terribly difficult to follow once you understand the tricks.
For example (Go Playground):
type PublicKey struct {
ID int `json:"id"`
Key string `json:"key"`
Data []MyData
}
type MyData struct {
ID string `json:"id"`
Value int `json:"value"`
}
func (pk *PublicKey) UnmarshalJSON(bs []byte) error {
// Unmarshal into a RawMessage so we can inspect the array length.
var rawMessage []json.RawMessage
err := json.Unmarshal(bs, &rawMessage)
if err != nil {
return err
}
if len(rawMessage) != 2 {
return fmt.Errorf("expected array of length 2, got %d", len(rawMessage))
}
// Parse the first object as PublicKey using the default unmarshaler
// using a temporary type that is an alias for the target type.
type PublicKey2 PublicKey
var pk2 PublicKey2
err = json.Unmarshal(rawMessage[0], &pk2)
if err != nil {
return err
}
// Parse the second object as []MyData in the usual way.
err = json.Unmarshal(rawMessage[1], &pk2.Data)
if err != nil {
return err
}
// Finally, assign the aliased object to the target object.
*pk = PublicKey(pk2)
return nil
}
func main() {
var pk PublicKey
err := json.Unmarshal([]byte(jsonstr), &pk)
if err != nil {
panic(err)
}
fmt.Printf("%#v\n", pk)
// main.PublicKey{ID:1, Key:"my_key", Data:[]main.MyData{main.MyData{ID:"some_id", Value:12}, main.MyData{ID:"anorther_id", Value:13}}}
}

How to Unmarshal an inconsistent JSON field that can be a string *or* an array of string?

I am having trouble Unmarshalling some Json I don't have control over.
There is one field that 99% of the time is a string but occasionally is an array.
type MyListItem struct {
Date string `json:"date"`
DisplayName string `json:"display_name"`
}
type MyListings struct {
CLItems []MyListItem `json:"myitems"`
}
var mylist MyListings
err = json.Unmarshal(jsn, &mylist)
if err != nil {
fmt.Printf("JSON:\n%s\n error:%v\n", string(jsn), err)
return
}
Json is as follows:
{
"date": "30 Apr",
"display_name": "Mr Smith"
},
{
"date": "30 Apr",
"display_name": ["Mr Smith", "Mr Jones"]
}
error: json: cannot unmarshal array into Go struct field MyListItem.display_name of type string
Use json.RawMessage to capture the varying field.
Use the json "-" name to hide the DisplayName field from decoder. The application will fill this field after the top-level JSON is decoded.
type MyListItem struct {
Date string `json:"date"`
RawDisplayName json.RawMessage `json:"display_name"`
DisplayName string `json:"-"`
}
Unmarshal the top-level JSON:
var li MyListItem
if err := json.Unmarshal(data, &li); err != nil {
// handle error
}
Unmarshal the display name depending on the type of the raw data:
if len(li.RawDisplayName) > 0 {
switch li.RawDisplayName[0] {
case '"':
if err := json.Unmarshal(li.RawDisplayName, &li.DisplayName); err != nil {
// handle error
}
case '[':
var s []string
if err := json.Unmarshal(li.RawDisplayName, &s); err != nil {
// handle error
}
// Join arrays with "&" per OP's comment on the question.
li.DisplayName = strings.Join(s, "&")
}
}
Incorporate the above into a for loop to handle MyListings:
var listings MyListings
if err := json.Unmarshal([]byte(data), &listings); err != nil {
// handle error
}
for i := range listings.CLItems {
li := &listings.CLItems[i]
if len(li.RawDisplayName) > 0 {
switch li.RawDisplayName[0] {
case '"':
if err := json.Unmarshal(li.RawDisplayName, &li.DisplayName); err != nil {
// handle error
}
case '[':
var s []string
if err := json.Unmarshal(li.RawDisplayName, &s); err != nil {
// handle error
}
li.DisplayName = strings.Join(s, "&")
}
}
}
If there's more than one place in the data model where a value can be a string or []string, it can be helpful to encapsulate the logic in a type. Parse the JSON data in an implementation of the json.Unmarshaler interface.
type multiString string
func (ms *multiString) UnmarshalJSON(data []byte) error {
if len(data) > 0 {
switch data[0] {
case '"':
var s string
if err := json.Unmarshal(data, &s); err != nil {
return err
}
*ms = multiString(s)
case '[':
var s []string
if err := json.Unmarshal(data, &s); err != nil {
return err
}
*ms = multiString(strings.Join(s, "&"))
}
}
return nil
}
Use it like this:
type MyListItem struct {
Date string `json:"date"`
DisplayName multiString `json:"display_name"`
}
type MyListings struct {
CLItems []MyListItem `json:"myitems"`
}
var listings MyListings
if err := json.Unmarshal([]byte(data), &listings); err != nil {
log.Fatal(err)
}
Here's the code to get the value as a slice of strings instead of as a single string with values joined by &.
type multiString []string
func (ms *multiString) UnmarshalJSON(data []byte) error {
if len(data) > 0 {
switch data[0] {
case '"':
var s string
if err := json.Unmarshal(data, &s); err != nil {
return err
}
*ms = multiString{s}
case '[':
if err := json.Unmarshal(data, (*[]string)(ms)); err != nil {
return err
}
}
}
return nil
}
As an alternative, this builds off of the answer from @ThunderCat but instead of using json.RawMessage, uses interface{} and a type switch:
package main
import (
"encoding/json"
"fmt"
"log"
)
type MyListItem struct {
Date string `json:"date"`
DisplayName string `json:"-"`
RawDisplayName interface{} `json:"display_name"`
}
func (li *MyListItem) UnmarshalJSON(data []byte) error {
type localItem MyListItem
var loc localItem
if err := json.Unmarshal(data, &loc); err != nil {
return err
}
*li = MyListItem(loc)
switch li.RawDisplayName.(type) {
case string:
li.DisplayName = li.RawDisplayName.(string)
case []interface{}:
vals := li.RawDisplayName.([]interface{})
if len(vals) > 0 {
li.DisplayName, _ = vals[0].(string)
for _, v := range vals[1:] {
li.DisplayName += "&" + v.(string)
}
}
}
return nil
}
func test(data string) {
var li MyListItem
if err := json.Unmarshal([]byte(data), &li); err != nil {
log.Fatal(err)
}
fmt.Println(li.DisplayName)
}
func main() {
test(`
{
"date": "30 Apr",
"display_name": "Mr Smith"
}`)
test(`
{
"date": "30 Apr",
"display_name": ["Mr Smith", "Mr Jones"]
}`)
}

JSON Unmarshal Irregular JSON field

I have this code:
type Response struct {
ID string `json:"id"`
Tags Tags `json:"tags,omitempty"`
}
type Tags struct {
Geo []string `json:"geo,omitempty"`
Keyword []string `json:"keyword,omitempty"`
Storm []string `json:"storm,omitempty"`
}
func (t *Tags) UnmarshalJSON(b []byte) (err error) {
str := string(b)
if str == "" {
t = &Tags{}
return nil
}
err = json.Unmarshal(b, t)
if err != nil {
return err
}
return nil
}
Now, my JSON response looks like this:
[{
"id": "/cms/v4/assets/en_US",
"doc": [{
"id": "af02b41d-c2c5-48ec-9dbc-ceed693bdbac",
"tags": {
"geo": [
"DMA:US.740:US"
]
}
},
{
"id": "6a90d9ed-7978-4c18-8e36-c01cf4260492",
"tags": ""
},
{
"id": "32cfd045-98ac-408c-b464-c74e02466339",
"tags": {
"storm": [
"HARVEY - AL092017"
],
"keyword": [
"hurrcane",
"wunderground"
]
}
}
]
}]
Preferably, I'd change the JSON response to be done correctly, but I cannot. Unmarshaling continues to error out (goroutine stack exceeds 1000000000-byte limit). Preferably, I'd rather do this using easyjson or ffjson but doubt it is possible. Suggestions?
Your UnmarshalJSON function calls itself recursively, which will cause the stack to explode in size.
func (t *Tags) UnmarshalJSON(b []byte) (err error) {
str := string(b)
if str == "" {
t = &Tags{}
return nil
}
err = json.Unmarshal(b, t) // <--- here it calls itself again
if err != nil {
return err
}
return nil
}
If you have a reason to call json.Unmarshal from within a UnmarshalJSON function, it must be on a different type. A common way to do this is to use a local alias:
type tagsAlias Tags
var ta tagsAlias
err = json.Unmarshal(b, &ta)
if err != nil {
return err
}
*t = Tags(ta)
Also note that t = &Tags{} does nothing in your function; it assigns a new value to the local variable t, but that value is lost as soon as the function exits. If you really want to change the value t points to, assign through the pointer with *t = Tags{}. Even that is unnecessary unless you're trying to unset a previously populated Tags value.