Unmarshal JSON preserving null values - json

My scenario is the following:
I have a server/worker made with Go. In a background routine, the server receives messages in JSON format, and then updates a MongoDB database with this data.
One of the problems is, some of the MongoDB data types, such as ObjectId and Date, are often converted to string when they must be represented as JSON, so before inserting that data into the database, I unmarshall that JSON in a structure, and then send that structure to the MongoDB driver. The structure implements methods such as UnmarshalJSON and MarshalBSONValue, so their data types are preserved.
Great, everything is solved. But by using structures I get another problem, supposing I have the following structure:
type Integers struct {
Foo *int `json:"foo" bson:"foo"`
Bar *int `json:"bar" bson:"bar"`
Baz *int `json:"baz" bson:"baz"`
}
And then I receive the following JSON:
{"foo": 0, "bar": null}
With this JSON, I should be updating my database with foo = 0, bar = null, and ignore baz. However, if I unmarshall this JSON in my structure, I'll have the equivalent of:
Integers{
Foo: 0,
Bar: nil,
Baz: nil,
}
But with this I can't tell if I received bar and baz, or they just defaulted to nil, so I can't properly update the database.
How I believe it could be solved:
By having the following structure:
type Integers struct {
Foo **int `json:"foo,omitempty" bson:"foo,omitempty"`
Bar **int `json:"bar,omitempty" bson:"bar,omitempty"`
Baz **int `json:"baz,omitempty" bson:"baz,omitempty"`
}
I would be able to differentiate between a null, and a non-received value, as the following example:
var foo int = 0
var fooPointer *int = &foo
var barPointer *int = nil
integers := Integers{
Foo: &fooPointer,
Bar: &barPointer,
Baz: nil,
}
With this structure, baz will not be inserted in the database, as its value is nil, and nil is ignored thanks to the flag omitempty. bar however is not nil, but it points to nil, which is different from being empty, so it's properly inserted as null in the database.
But how can I achieve this initialization with the received JSON?
The standard JSON unmarshaller would initialize both bar and baz as nil.
Implementing my own marshaller methods, such as
type NullableInt **int
func (i NullableInt) MarshalJSON() ([]byte, error) {
}
func (i NullableInt) UnmarshalJSON(data []byte) error {
}
is not possible either, since the underlying type of NullableInt is a pointer, and Go does not allow methods to be declared on such a type.
So, which approach could I use to solve this problem?

On the decode side, you can write a custom unmarshaler for a custom type:
type MaybeInt struct {
Present bool
Null bool
Value int64
}
func (m *MaybeInt) UnmarshalJSON(data []byte) error {
s := string(data)
m.Present = true
if s == "null" {
m.Null = true
return nil
}
v, err := strconv.ParseInt(s, 10, 64)
m.Value = v
return err
}
Complete example here. Unfortunately, this doesn't work on the encode side: there is no way for the MarshalJSON handler to indicate that the field is empty. The obvious way would be to return nil, nil from a Marshaler, but that doesn't work. Neither does returning []byte{}, nil.
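To make the decode side concrete, here is a minimal sketch that runs the JSON from the question through MaybeInt. The Integers struct mirrors the question; describe and decode are hypothetical helpers added only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// MaybeInt records whether a field was present and whether it was null.
type MaybeInt struct {
	Present bool
	Null    bool
	Value   int64
}

// UnmarshalJSON is only invoked for keys that appear in the input,
// so Present stays false for missing keys.
func (m *MaybeInt) UnmarshalJSON(data []byte) error {
	s := string(data)
	m.Present = true
	if s == "null" {
		m.Null = true
		return nil
	}
	v, err := strconv.ParseInt(s, 10, 64)
	m.Value = v
	return err
}

type Integers struct {
	Foo MaybeInt `json:"foo"`
	Bar MaybeInt `json:"bar"`
	Baz MaybeInt `json:"baz"`
}

// describe is a hypothetical helper that names the three states.
func describe(m MaybeInt) string {
	switch {
	case !m.Present:
		return "absent"
	case m.Null:
		return "null"
	default:
		return strconv.FormatInt(m.Value, 10)
	}
}

// decode is a hypothetical wrapper around json.Unmarshal.
func decode(input string) (Integers, error) {
	var in Integers
	err := json.Unmarshal([]byte(input), &in)
	return in, err
}

func main() {
	in, err := decode(`{"foo": 0, "bar": null}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(describe(in.Foo), describe(in.Bar), describe(in.Baz))
	// prints: 0 null absent
}
```

Note that the fields are plain MaybeInt values, not pointers; that is what lets the unmarshaler see the literal null.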
You might suppose: well, let's use a pointer, and set it to nil when we want to say that the field should be omitted. This works on the encode side (together with omitempty), but now the decode side fails, because the decoder sees the literal null, sets the pointer to nil, and doesn't call our unmarshaler at all!
Ultimately, we can combine both techniques: read into MaybeInt, encode (write) from *MaybeInt. We'll need parallel struct types. We can set the output type based on the input type. I don't claim this to be pretty, and the reflect code in it is terrible (you can also see all my debug tracery), but this actually seems to work: Playground link. In practice, instead of using reflect, you might just write a function for each case of a "maybe" value.
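As a rough sketch of the "one function per case" idea, without reflect: decode into a struct of MaybeInt values, then copy into a parallel struct of *MaybeInt with omitempty. The names Input, Output, orNil and roundTrip are made up for this example, and the MarshalJSON method is my own addition, not part of the code above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

type MaybeInt struct {
	Present bool
	Null    bool
	Value   int64
}

func (m *MaybeInt) UnmarshalJSON(data []byte) error {
	s := string(data)
	m.Present = true
	if s == "null" {
		m.Null = true
		return nil
	}
	v, err := strconv.ParseInt(s, 10, 64)
	m.Value = v
	return err
}

// MarshalJSON has a value receiver, so it is also reachable through *MaybeInt.
func (m MaybeInt) MarshalJSON() ([]byte, error) {
	if m.Null {
		return []byte("null"), nil
	}
	return []byte(strconv.FormatInt(m.Value, 10)), nil
}

// Input is decoded into values; Output is encoded from pointers.
type Input struct {
	Foo MaybeInt `json:"foo"`
	Bar MaybeInt `json:"bar"`
	Baz MaybeInt `json:"baz"`
}

type Output struct {
	Foo *MaybeInt `json:"foo,omitempty"`
	Bar *MaybeInt `json:"bar,omitempty"`
	Baz *MaybeInt `json:"baz,omitempty"`
}

// orNil returns nil for absent fields so that omitempty drops them.
func orNil(m MaybeInt) *MaybeInt {
	if !m.Present {
		return nil
	}
	return &m
}

// roundTrip decodes the input and re-encodes it, preserving null vs. absent.
func roundTrip(input string) (string, error) {
	var in Input
	if err := json.Unmarshal([]byte(input), &in); err != nil {
		return "", err
	}
	out := Output{Foo: orNil(in.Foo), Bar: orNil(in.Bar), Baz: orNil(in.Baz)}
	b, err := json.Marshal(out)
	return string(b), err
}

func main() {
	out, err := roundTrip(`{"foo": 0, "bar": null}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints: {"foo":0,"bar":null}
}
```

The same Output-style struct could carry bson tags if the MongoDB driver treats nil pointers with omitempty the same way, but I have not verified that here.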

Related

Unmarshall JSON into a Generic Struct [duplicate]

I'm new to golang generics and have the following setup.
I've gathered loads of different kinds of reports.
Each report has enclosing fields
So I wrapped it in a ReportContainerImpl
I've used a type argument of [T Reportable] where the Reportable is defined as follows
type Reportable interface {
ExportDataPointReport | ImportDataPointReport | MissingDataPointReport | SensorThresoldReport
}
Each type in the type constraint is a struct that is to be embedded in the container.
type ReportContainerImpl[T Reportable] struct {
LocationID string `json:"lid"`
Provider string `json:"pn"`
ReportType ReportType `json:"m"`
Body T `json:"body"`
}
I use a discriminator ReportType to determine the concrete type when Unmarshal.
type ReportType string
const (
ReportTypeExportDataPointReport ReportType = "ExportDataPointReport"
ReportTypeImportDataPointReport ReportType = "ImportDataPointReport"
ReportTypeMissingDataPointReport ReportType = "MissingDataPointReport"
ReportTypeSensorThresoldReport ReportType = "SensorThresoldReport"
)
Since Go does not support type assertions on structs (only on interfaces), it is not possible to cast the type when unmarshalling. Go also does not support a pointer to the "raw" (uninstantiated) generic type. Hence, I've created an interface that ReportContainerImpl implements.
type ReportContainer interface {
GetLocationID() string
GetProvider() string
GetReportType() ReportType
GetBody() interface{}
}
The problem I then get is that I cannot put type constraints on the return type in any shape or form, and I am back at "freetext semantics" on the GetBody() function to allow for a type assertion once Unmarshal is done.
container, err := UnmarshalReportContainer(data)
if rep, ok := container.GetBody().(ExportDataPointReport); ok {
// Use the ReportContainerImpl[ExportDataPointReport] here...
}
Maybe I'm getting this wrong? - but however I do this, I always end up with somewhere needs an interface{} or to know the exact type before Unmarshal
Do you have a better suggestion how to solve this in a type (safer) way?
Cheers,
Mario :)
For completeness I add the UnmarshalReportContainer here
func UnmarshalReportContainer(data []byte) (ReportContainer, error) {
type Temp struct {
LocationID string `json:"lid"`
Provider string `json:"pn"`
ReportType ReportType `json:"m"`
Body *json.RawMessage `json:"body"`
}
var temp Temp
err := json.Unmarshal(data, &temp)
if err != nil {
return nil, err
}
switch temp.ReportType {
case ReportTypeExportDataPointReport:
var report ExportDataPointReport
err := json.Unmarshal(*temp.Body, &report)
return &ReportContainerImpl[ExportDataPointReport]{
LocationID: temp.LocationID,
Provider: temp.Provider,
ReportType: temp.ReportType,
Body: report,
}, err
// ...
}
}
but however I do this, I always end up with somewhere needs an interface{} or to know the exact type before Unmarshal
Precisely.
The concrete types needed to instantiate some generic type or function like ReportContainerImpl or UnmarshalReportContainer must be known at compile time, when you write the code. JSON unmarshalling instead occurs at run-time, when you have the byte slice populated with the actual data.
To unmarshal dynamic JSON based on some discriminatory value, you still need a switch.
Do you have a better suggestion how to solve this in a type (safer) way?
Just forgo parametric polymorphism. It's not a good fit here. Keep the code you have now with json.RawMessage, unmarshal the dynamic data conditionally in the switch and return the concrete structs that implement ReportContainer interface.
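A minimal non-generic sketch of that suggestion follows. The JSON tags and ReportType constants come from the question; the fields of the two report bodies, the Report struct, and the summarize helper are assumptions made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ReportType string

const (
	ReportTypeExportDataPointReport  ReportType = "ExportDataPointReport"
	ReportTypeMissingDataPointReport ReportType = "MissingDataPointReport"
)

// The fields of these two report bodies are hypothetical.
type ExportDataPointReport struct {
	Points int `json:"points"`
}

type MissingDataPointReport struct {
	Missing []string `json:"missing"`
}

// Report is a non-generic container; Body holds a pointer to one of the
// concrete report structs.
type Report struct {
	LocationID string
	Provider   string
	ReportType ReportType
	Body       any
}

func UnmarshalReport(data []byte) (*Report, error) {
	var env struct {
		LocationID string          `json:"lid"`
		Provider   string          `json:"pn"`
		ReportType ReportType      `json:"m"`
		Body       json.RawMessage `json:"body"`
	}
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, err
	}
	// The discriminator is only known at run time, hence the switch.
	var body any
	switch env.ReportType {
	case ReportTypeExportDataPointReport:
		body = new(ExportDataPointReport)
	case ReportTypeMissingDataPointReport:
		body = new(MissingDataPointReport)
	default:
		return nil, fmt.Errorf("unknown report type %q", env.ReportType)
	}
	if err := json.Unmarshal(env.Body, body); err != nil {
		return nil, err
	}
	return &Report{env.LocationID, env.Provider, env.ReportType, body}, nil
}

// summarize shows the type switch a caller would use on Body.
func summarize(r *Report) string {
	switch b := r.Body.(type) {
	case *ExportDataPointReport:
		return fmt.Sprintf("export with %d points", b.Points)
	case *MissingDataPointReport:
		return fmt.Sprintf("missing %d sensors", len(b.Missing))
	}
	return "unknown"
}

func main() {
	r, err := UnmarshalReport([]byte(`{"lid":"L1","pn":"acme","m":"ExportDataPointReport","body":{"points":3}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(summarize(r)) // prints: export with 3 points
}
```

The default case matters: without it, an unrecognized discriminator would silently produce a container with no body.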
As a general solution — if, and only if, you can overcome this chicken-and-egg problem and make type parameters known at compile time, you can write a minimal generic unmarshal function like this:
func unmarshalAny[T any](bytes []byte) (*T, error) {
out := new(T)
if err := json.Unmarshal(bytes, out); err != nil {
return nil, err
}
return out, nil
}
This is only meant to illustrate the principle. Note that json.Unmarshal already accepts any type, so if your generic function actually does nothing except new(T) and return, like in my example, it is no different than "inlining" the entire thing as if unmarshalAny didn't exist.
v, err := unmarshalAny[SomeType](bytes)
is functionally equivalent to
out := &SomeType{}
err := json.Unmarshal(bytes, out)
If you plan to put more logic in unmarshalAny, its usage may be warranted. Your mileage may vary; in general, don't use type parameters when it's not actually necessary.
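One place where such a generic helper can earn its keep is a lookup table: each entry fixes the type parameter at compile time, and the run-time discriminator only picks an entry. This is a sketch under my own assumptions; decodeAs, Point, Label, and the decoders map are hypothetical names, and the result is still an any that the caller has to type-assert.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAs is a generic helper; T must be fixed wherever it is referenced.
func decodeAs[T any](data []byte) (any, error) {
	out := new(T)
	if err := json.Unmarshal(data, out); err != nil {
		return nil, err
	}
	return out, nil
}

// Hypothetical payload types.
type Point struct{ X, Y int }
type Label struct{ Text string }

// decoders maps a run-time discriminator to a compile-time instantiation.
var decoders = map[string]func([]byte) (any, error){
	"point": decodeAs[Point],
	"label": decodeAs[Label],
}

func decode(kind string, data []byte) (any, error) {
	dec, ok := decoders[kind]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", kind)
	}
	return dec(data)
}

func main() {
	v, err := decode("point", []byte(`{"X": 1, "Y": 2}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if p, ok := v.(*Point); ok {
		fmt.Println(p.X, p.Y) // prints: 1 2
	}
}
```

The map replaces the switch, but it does not remove the need for an interface value at the boundary.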

Unmarshal remaining JSON after performing custom unmarshalling

I have a JSON object that contains an implementation of an interface within it. I'm attempting to take that JSON and unmarshal it into a struct whilst creating the implementation of the interface.
I've managed to get it to implement the interface with a custom JSON unmarshal function, however I'm struggling to piece together how to then unmarshal the rest of the fields.
I've created an example in the Go playground
https://play.golang.org/p/ztF7H7etdjM
My JSON being passed into my application is
{
"address":"1FYuJ4MsVmpzPoFJ6svJMJfygn91Eubid9",
"nonce":13,
"network_id":"qadre.demo.balance",
"challenge":"f2b19e71876c087e681fc092ea3a34d5680bbfe772e40883563e1d5513bb593f",
"type":"verifying_key",
"verifying_key":{
"verifying_key":"3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29",
"fqdn":"huski.service.key"
},
"signature":"a3bf8ee202a508d5a5632f50b140b70b7095d8836493dc7ac4159f6f3350280078b3a58b2162a240bc8c7485894554976a9c7b5d279d3f5bf49fec950f024e02",
"fqdn":"huski.service.SingleKeyProof"
}
I've attempted to do a json.Unmarshal and pass in a new struct for the remaining fields however it seems to put me in an infinite loop, my application hangs and then crashes
The best solution I've come up with so far is to unmarshal the JSON into a map[string]interface{} and do each field separately; this feels very clunky though.
var m map[string]interface{}
if err := json.Unmarshal(data, &m); err != nil {
return err
}
ad, ok := m["address"]
if ok {
s.Address = ad.(string)
}
fqdn, ok := m["fqdn"]
if ok {
s.FQDN = fqdn.(string)
}
n, ok := m["nonce"]
if ok {
s.Nonce = int64(n.(float64))
}
c, ok := m["challenge"]
if ok {
s.Challenge = []byte(c.(string))
}
network, ok := m["network_id"]
if ok {
s.NetworkID = network.(string)
}
sig, ok := m["signature"]
if ok {
s.Signature = []byte(sig.(string))
}
The reason your code gets into an infinite loop when you try to unmarshal the rest of the fields is, I presume, that the implementation of UnmarshalJSON, after it's done unmarshaling the verifying key, calls json.Unmarshal with the receiver, which in turn calls the UnmarshalJSON method on the receiver, and so they invoke each other ad infinitum.
What you can do is to create a temporary type using the existing type as its definition, this will "keep the structure" but "drop the methods", then unmarshal the rest of the fields into an instance of the new type, and, after unmarshal is done, convert the instance to the original type and assign that to the receiver.
While this fixes the infinite loop, it also re-introduces the original problem of json.Unmarshal not being able to unmarshal into a non-empty interface type. To fix that you can embed the new type in another temporary struct that has a field with the same json tag as the problematic field which will cause it to be "overshadowed" while json.Unmarshal is doing its work.
type SingleKey struct {
FQDN string `json:"fqdn"`
Address string `json:"address"`
Nonce int64 `json:"nonce"`
Challenge []byte `json:"challenge"`
NetworkID string `json:"network_id"`
Type string `json:"type"`
VerifyingKey PublicKey `json:"verifying_key"`
Signature []byte `json:"signature"`
}
func (s *SingleKey) UnmarshalJSON(data []byte) error {
type _SingleKey SingleKey
var temp struct {
RawKey json.RawMessage `json:"verifying_key"`
_SingleKey
}
if err := json.Unmarshal(data, &temp); err != nil {
return err
}
*s = SingleKey(temp._SingleKey)
switch s.Type {
case "verifying_key":
s.VerifyingKey = &PublicKeyImpl{}
// other cases ...
}
return json.Unmarshal([]byte(temp.RawKey), s.VerifyingKey)
}
https://play.golang.org/p/L3gdQZF47uN
Looking at what you've done in your custom unmarshalling function, you seem to be passing in a map with the name of fields as index, and the reflect.Type you want to unmarshal said value into. That, to me, suggests that the keys might be different for different payloads, but that each key has a distinct type associated with it. You can perfectly handle data like this with a simple wrapper type:
type WrappedSingleKey struct {
FQDN string `json:"fqdn"`
Address string `json:"address"`
Nonce int64 `json:"nonce"`
Challenge []byte `json:"challenge"`
NetworkID string `json:"network_id"`
Type string `json:"type"`
VerifyingKey json.RawMessage `json:"verifying_key"`
OtherKey json.RawMessage `json:"other_key"`
Signature []byte `json:"signature"`
}
type SingleKey struct {
FQDN string `json:"fqdn"`
Address string `json:"address"`
Nonce int64 `json:"nonce"`
Challenge []byte `json:"challenge"`
NetworkID string `json:"network_id"`
Type string `json:"type"`
VerifyingKey *PublicKey `json:"verifying_key,omitempty"`
OtherType *OtherKey `json:"other_key,omitempty"`
Signature []byte `json:"signature"`
}
So I've changed the type of your VerifyingKey field to a json.RawMessage. That's basically telling json.Unmarshal to leave that as raw JSON input. For every custom/optional field, add a corresponding RawMessage field.
In the unwrapped type, I've changed VerifyingKey to a pointer and added the omitempty bit to the tag. That's just to accommodate multiple types, and not have to worry about custom marshalling to avoid empty fields, like the included OtherType field I have. To get what you need, then:
func (s *SingleKey) UnmarshalJSON(data []byte) error {
w := WrappedSingleKey{} // create wrapped instance
if err := json.Unmarshal(data, &w); err != nil {
return err
}
switch w.Type {
case "verifying_key":
var pk PublicKey
if err := json.Unmarshal([]byte(w.VerifyingKey), &pk); err != nil {
return err
}
s.VerifyingKey = &pk // assign
case "other_key":
var ok OtherKey
if err := json.Unmarshal([]byte(w.OtherKey), &ok); err != nil {
return err
}
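s.OtherType = &ok // the field is named OtherType in SingleKey
}
// copy over the fields that didn't require anything special
s.FQDN = w.FQDN
s.Address = w.Address
s.Nonce = w.Nonce
s.Challenge = w.Challenge
s.NetworkID = w.NetworkID
s.Type = w.Type
s.Signature = w.Signature
return nil
}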
This is a fairly simple approach, does away with the reflection, tons of functions, and is quite commonly used. It's something that lends itself quite well to code generation, too. The individual assignment of the fields is a bit tedious, though. You might think that you can solve that by embedding the SingleKey type into the wrapper, but be careful: this will recursively call your custom unmarshaller function.
You could, for example, update all the fields in the wrapped type to be pointers, and have them point to fields on your actual type. That does away with the manual copying of fields... It's up to you, really.
Note
I didn't test this code, just wrote it as I went along. It's something I've used in the past, and I believe what I wrote here should work, but no guarantees (as in: you might need to debug it a bit)
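The pointer-field variant mentioned above could look roughly like this. It is a sketch with a reduced set of fields: PublicKey's fields are taken from the sample JSON in the question, and parseSingleKey is a hypothetical helper. It relies on json.Unmarshal decoding into the value a non-nil pointer already points at.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PublicKey's fields are taken from the sample JSON in the question.
type PublicKey struct {
	VerifyingKey string `json:"verifying_key"`
	FQDN         string `json:"fqdn"`
}

type SingleKey struct {
	FQDN         string
	Address      string
	Nonce        int64
	Type         string
	VerifyingKey *PublicKey
}

func (s *SingleKey) UnmarshalJSON(data []byte) error {
	// Each pointer targets a field of s, so json.Unmarshal writes
	// straight into the receiver and no copying is needed afterwards.
	// The wrapper is an anonymous struct, so it has no UnmarshalJSON
	// method and cannot recurse.
	w := struct {
		FQDN         *string         `json:"fqdn"`
		Address      *string         `json:"address"`
		Nonce        *int64          `json:"nonce"`
		Type         *string         `json:"type"`
		VerifyingKey json.RawMessage `json:"verifying_key"`
	}{FQDN: &s.FQDN, Address: &s.Address, Nonce: &s.Nonce, Type: &s.Type}
	if err := json.Unmarshal(data, &w); err != nil {
		return err
	}
	switch s.Type {
	case "verifying_key":
		var pk PublicKey
		if err := json.Unmarshal(w.VerifyingKey, &pk); err != nil {
			return err
		}
		s.VerifyingKey = &pk
	}
	return nil
}

// parseSingleKey is a hypothetical convenience wrapper.
func parseSingleKey(input string) (SingleKey, error) {
	var s SingleKey
	err := json.Unmarshal([]byte(input), &s)
	return s, err
}

func main() {
	s, err := parseSingleKey(`{"address":"1FYu","nonce":13,"type":"verifying_key",` +
		`"verifying_key":{"verifying_key":"3b6a","fqdn":"huski.service.key"},` +
		`"fqdn":"huski.service.SingleKeyProof"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.Address, s.Nonce, s.VerifyingKey.FQDN)
	// prints: 1FYu 13 huski.service.key
}
```

One caveat: a JSON null for one of these keys sets the wrapper's pointer to nil and leaves the field on the receiver untouched.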

Why Is Unmarshal Failing With A Nested Struct?

I am trying to retrieve information using Reddit's API. Here is some documentation on their json response, however, I got most of my information by just viewing the link in the browser and pretty-printing the response here.
The following code behaves as intended when the "Replies" field is commented out, but fails when it's not.
[edit] getData() is a function I wrote that uses Go's http Client to get a site response in bytes.
type redditThing struct {
Data struct {
Children []struct {
Data struct {
Permalink string
Subreddit string
Title string
Body string
Replies redditThing
}
}
}
}
func visitLink(link string) {
println("visiting:", link)
var comments []redditThing
if err := json.Unmarshal(getData(link+".json?raw_json=1"), &comments); err != nil {
logError.Println(err)
return
}
}
This throws the following error
json: cannot unmarshal string into Go struct field .Data.Children.Data.Replies.Data.Children.Data.Replies.Data.Children.Data.Replies of type main.redditThing
Any help would be greatly appreciated. Thank you all in advance!
[edit] here a link to some data causing the program to fail
The replies field can be the empty string or a redditThing. Fix by adding an Unmarshal function to handle the empty string:
func (rt *redditThing) UnmarshalJSON(data []byte) error {
// Do nothing if data is the empty string.
if bytes.Equal(data, []byte(`""`)) {
return nil
}
// Prevent recursion by declaring type x with
// same underlying type as redditThing, but
// with no methods.
type x redditThing
return json.Unmarshal(data, (*x)(rt))
}
The x type is used to prevent indefinite recursion. If the final line of the method is json.Unmarshal(data, rt), then json.Unmarshal function will call redditThing.UnmarshalJSON method which calls json.Unmarshal function and so on. Boom!
The statement type x redditThing declares a new type named x with the same underlying type as redditThing. The underlying type is an anonymous struct type. A newly declared type does not inherit the methods of the type it is defined from, so x has no methods and, crucially, no UnmarshalJSON method. This prevents recursion.

Golang Null Types and json.Decode()

I have not been able to find a way around this issue. If I have a structure I would like to populate with JSON from an http.Request, I have no way to tell which values were actually passed in. For instance, if I pass in an empty JSON object and run json.Decode on a structure that looks like this...
var Test struct {
Number int `json:"number"`
}
I now have a JSON object that supposedly was passed with a key of number and a value of zero, when in fact I would rather have this return nothing at all. Does Go provide another method that would actually allow me to see what JSON has been passed in or not?
Sorry for the rambling; I have been trying to figure out how to do this for a few days now and it's driving me nuts.
Thanks for any help.
Edit:
I made this to depict exactly what I am talking about http://play.golang.org/p/aPFKSvuxC9
You could use pointers, for example:
func main() {
var jsonBlob = []byte(`[
{"Name": "Platypus"},
{"Name": "Quoll", "Order": 100}
]`)
type Animal struct {
Name string
Order *int
}
var animals []Animal
err := json.Unmarshal(jsonBlob, &animals)
if err != nil {
fmt.Println("error:", err)
}
for _, a := range animals {
if a.Order != nil {
fmt.Printf("got order, %s : %d\n", a.Name, *a.Order)
}
}
}
I don't see how you could do this by giving a struct to the Unmarshal function. With the following structure for instance:
type A struct {
Hello string
Foo int
Baz string
}
var a A
json.Unmarshal(data, &a)
Even by doing another implementation of Unmarshal, there would be only two (simple) possibilities:
If baz is not in the json data, set a.Baz to a default value, compatible with its type: the empty string (or 0 if it's an integer). This is the current implementation.
If baz is not in the json data, return an error. That would be very inconvenient if the absence of baz is a normal behaviour.
Another possibility would be to use pointers, with nil as the default value in the same spirit as the default value I talked about, but there would still be an issue if your JSON could contain null values: you would not be able to distinguish values that were in the JSON but set to null from values that were not in the JSON and were left as nil by default.
However, this solution might suit you: instead of using a struct, why not use a map[string]interface{}? The Unmarshal function would not have to add a default value for non-present fields, and it would be able to retrieve any type of data from the JSON.
var b = []byte(`[{"Name": "Platypus"}, {"Name": "Quoll", "Order": 100}]`)
var m []map[string]interface{}
err := json.Unmarshal(b, &m)
fmt.Println(m)
// [map[Name:Platypus] map[Name:Quoll Order:100]]
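With a map, the three cases (absent, null, set) can be told apart using the two-value map lookup. A minimal sketch, where classify is a hypothetical helper name:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// classify reports whether key is absent, null, or set in a JSON object.
func classify(input, key string) (string, error) {
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(input), &m); err != nil {
		return "", err
	}
	v, ok := m[key]
	switch {
	case !ok:
		return "absent", nil // the key never appeared in the JSON
	case v == nil:
		return "null", nil // the key appeared with a JSON null
	default:
		return "set", nil
	}
}

func main() {
	for _, in := range []string{`{}`, `{"number": null}`, `{"number": 0}`} {
		state, err := classify(in, "number")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(in, "->", state)
	}
}
```

The price is that numbers arrive as float64 and every value needs a type assertion before use.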

How to not marshal an empty struct into JSON with Go?

I have a struct like this:
type Result struct {
Data MyStruct `json:"data,omitempty"`
Status string `json:"status,omitempty"`
Reason string `json:"reason,omitempty"`
}
But even if the instance of MyStruct is entirely empty (meaning, all values are default), it's being serialized as:
"data":{}
I know that the encoding/json docs specify that "empty" fields are:
false, 0, any nil pointer or interface value, and any array,
slice, map, or string of length zero
but with no consideration for a struct with all empty/default values. All of its fields are also tagged with omitempty, but this has no effect.
How can I get the JSON package to not marshal my field that is an empty struct?
As the docs say, "any nil pointer." -- make the struct a pointer. Pointers have obvious "empty" values: nil.
Fix - define the type with a struct pointer field:
type Result struct {
Data *MyStruct `json:"data,omitempty"`
Status string `json:"status,omitempty"`
Reason string `json:"reason,omitempty"`
}
Then a value like this:
result := Result{}
Will marshal as:
{}
Explanation: Notice the *MyStruct in our type definition. JSON serialization doesn't care whether a non-nil field is a pointer or not: the encoder follows the pointer and encodes the value it points to. A nil pointer, however, counts as empty, so omitempty drops the field.
Just note that if you do change the field type from MyStruct to *MyStruct, you will need pointers to struct values to populate it, like so:
Data: &MyStruct{ /* values */ }
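A quick sketch of how the pointer version behaves for nil, empty, and populated values. The Name field on MyStruct and the encode helper are assumptions for the sake of the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MyStruct's single field is hypothetical.
type MyStruct struct {
	Name string `json:"name,omitempty"`
}

type Result struct {
	Data   *MyStruct `json:"data,omitempty"`
	Status string    `json:"status,omitempty"`
	Reason string    `json:"reason,omitempty"`
}

// encode is a small helper that returns the JSON as a string.
func encode(r Result) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Nil pointer: the field is omitted.
	fmt.Println(encode(Result{})) // prints: {}

	// Non-nil pointer to an empty struct: the field is still written.
	fmt.Println(encode(Result{Data: &MyStruct{}})) // prints: {"data":{}}

	// Populated struct.
	fmt.Println(encode(Result{Data: &MyStruct{Name: "x"}, Status: "ok"}))
	// prints: {"data":{"name":"x"},"status":"ok"}
}
```

So the caller has to leave Data nil, not point it at an empty MyStruct, to get the field omitted.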
As @chakrit mentioned in a comment, you can't get this to work by implementing json.Marshaler on MyStruct, and implementing a custom JSON marshalling function on every struct that uses it can be a lot more work. It really depends on your use case as to whether it's worth the extra work or whether you're prepared to live with empty structs in your JSON, but here's the pattern I use applied to Result:
type Result struct {
Data MyStruct
Status string
Reason string
}
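func (r Result) MarshalJSON() ([]byte, error) {
out := struct {
Data *MyStruct `json:"data,omitempty"`
Status string `json:"status,omitempty"`
Reason string `json:"reason,omitempty"`
}{
Status: r.Status,
Reason: r.Reason,
}
// Leave Data nil when r.Data is the zero value so that omitempty drops it.
// This comparison requires MyStruct to be a comparable type.
if r.Data != (MyStruct{}) {
out.Data = &r.Data
}
return json.Marshal(out)
}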
func (r *Result) UnmarshalJSON(b []byte) error {
decoded := new(struct {
Data *MyStruct `json:"data,omitempty"`
Status string `json:"status,omitempty"`
Reason string `json:"reason,omitempty"`
})
err := json.Unmarshal(b, decoded)
if err == nil {
// decoded.Data is nil when "data" was absent or null.
if decoded.Data != nil {
r.Data = *decoded.Data
}
r.Status = decoded.Status
r.Reason = decoded.Reason
}
return err
}
If you have huge structs with many fields this can become tedious, especially changing a struct's implementation later, but short of rewriting the whole json package to suit your needs (not a good idea), this is pretty much the only way I can think of getting this done while still keeping a non-pointer MyStruct in there.
Also, you don't have to use inline structs, you can create named ones. I use LiteIDE with code completion though, so I prefer inline to avoid clutter.
Data is an initialized struct, so it isn't considered empty because encoding/json only looks at the immediate value, not the fields inside the struct.
Unfortunately, returning nil from json.Marshaler doesn't currently work:
func (_ MyStruct) MarshalJSON() ([]byte, error) {
if empty {
return nil, nil // unexpected end of JSON input
}
// ...
}
You could give Result a marshaler as well, but it's not worth the effort.
The only option, as Matt suggests, is to make Data a pointer and set the value to nil.
There is an outstanding Golang proposal for this feature which has been active for over 4 years, so at this point, it is safe to assume that it will not make it into the standard library anytime soon. As @Matt pointed out, the traditional approach is to convert the structs to pointers-to-structs. If this approach is infeasible (or impractical), then an alternative is to use an alternate json encoder which does support omitting zero value structs.
I created a mirror of the Golang json library (clarketm/json) with added support for omitting zero value structs when the omitempty tag is applied. This library detects zeroness in a similar manner to the popular YAML encoder go-yaml by recursively checking the public struct fields.
e.g.
$ go get -u "github.com/clarketm/json"
import (
"fmt"
"github.com/clarketm/json" // drop-in replacement for `encoding/json`
)
type Result struct {
Data MyStruct `json:"data,omitempty"`
Status string `json:"status,omitempty"`
Reason string `json:"reason,omitempty"`
}
j, _ := json.Marshal(&Result{
Status: "204",
Reason: "No Content",
})
fmt.Println(string(j))
// Note: `data` is omitted from the resultant json.
{
"status": "204",
"reason": "No Content"
}