Why are string and []byte treated differently when unmarshaling JSON?

My understanding from reading the documentation was that string is essentially an immutable []byte and that one can easily convert between the two.
However, when unmarshaling from JSON this doesn't seem to be true. Take the following example program:
package main
import (
	"encoding/json"
	"fmt"
)
type STHRaw struct {
	Hash []byte `json:"hash"`
}
type STHString struct {
	Hash string `json:"hash"`
}
func main() {
	bytes := []byte(`{"hash": "nuyHN9wx4lZL2L3Ir3dhZpmggTQEIHEZcC3DUNCtQsk="}`)
	// new already returns a pointer, so it is passed to Unmarshal directly.
	stringHead := new(STHString)
	if err := json.Unmarshal(bytes, stringHead); err != nil {
		fmt.Println("unmarshal into string field failed:", err)
		return
	}
	rawHead := new(STHRaw)
	if err := json.Unmarshal(bytes, rawHead); err != nil {
		fmt.Println("unmarshal into []byte field failed:", err)
		return
	}
	fmt.Printf("String:\t\t%x\n", stringHead.Hash)
	fmt.Printf("Raw:\t\t%x\n", rawHead.Hash)
	fmt.Printf("Raw to string:\t%x\n", string(rawHead.Hash[:]))
}
This gives the following output:
String: 6e7579484e397778346c5a4c324c3349723364685a706d67675451454948455a63433344554e437451736b3d
Raw: 9eec8737dc31e2564bd8bdc8af77616699a0813404207119702dc350d0ad42c9
Raw to string: 9eec8737dc31e2564bd8bdc8af77616699a0813404207119702dc350d0ad42c9
Instead I would have expected to receive the same value each time.
What is the difference?

The designers of the encoding/json package made the decision that applications must provide valid UTF-8 text in string values and that applications can put arbitrary byte sequences in []byte values. The package base64 encodes []byte values to ensure that the resulting string is valid UTF-8.
The encoding of []byte values is described in the Marshal function documentation.
This decision was not dictated by the design of the Go language. The string type can contain arbitrary byte sequences. The []byte type can contain valid UTF-8 text.
The designers could have used a flag in the field tag to indicate that a string or []byte value should be encoded and which encoder to use, but that's not what they did.
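To make the relationship concrete, here is a minimal sketch that decodes the string field by hand with base64.StdEncoding and compares the result with what the []byte field received. The helper name decodeBoth is made up for this illustration; the struct types are the ones from the question.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

type STHRaw struct {
	Hash []byte `json:"hash"`
}

type STHString struct {
	Hash string `json:"hash"`
}

// decodeBoth (a hypothetical helper) unmarshals the same document twice:
// once into a string field and once into a []byte field.
func decodeBoth(doc []byte) (asString string, asBytes []byte, err error) {
	var s STHString
	if err = json.Unmarshal(doc, &s); err != nil {
		return "", nil, err
	}
	var r STHRaw
	if err = json.Unmarshal(doc, &r); err != nil {
		return "", nil, err
	}
	return s.Hash, r.Hash, nil
}

func main() {
	doc := []byte(`{"hash": "nuyHN9wx4lZL2L3Ir3dhZpmggTQEIHEZcC3DUNCtQsk="}`)
	asString, asBytes, err := decodeBoth(doc)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	// The string field keeps the JSON text as-is. Decoding it by hand
	// yields the bytes that the []byte field received automatically.
	manual, err := base64.StdEncoding.DecodeString(asString)
	if err != nil {
		fmt.Println("base64 decode failed:", err)
		return
	}
	fmt.Println(bytes.Equal(manual, asBytes)) // prints "true"
}
```

So the two fields carry the same information; the string field simply holds the base64 text that the []byte field has already decoded.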

Related

Replace characters in go serialization by using custom MarshalJSON method

I wrote a custom MarshalJSON method in order to replace the escape sequences \u003c and \u003e: https://go.dev/play/p/xJ-qcMN9QXl
In that example, MarshalJSON marshals an auxiliary struct that contains the same fields, replaces the sequences in the result, and returns it.
As the print statement placed just before the return from MarshalJSON shows, the sequences were replaced there, but the output of json.Marshal still contains them.
I'm missing something here but cannot figure out what. I'd appreciate your help.
In the Marshal documentation of the json package (https://pkg.go.dev/encoding/json#Marshal) you will find the following paragraph:
String values encode as JSON strings coerced to valid UTF-8, replacing invalid bytes with the Unicode replacement rune. So that the JSON will be safe to embed inside HTML <script> tags, the string is encoded using HTMLEscape, which replaces "<", ">", "&", U+2028, and U+2029 are escaped to "\u003c","\u003e", "\u0026", "\u2028", and "\u2029". This replacement can be disabled when using an Encoder, by calling SetEscapeHTML(false).
So try it using an Encoder, for example:
package main
import (
	"bytes"
	"encoding/json"
	"fmt"
)
type Foo struct {
	Name    string
	Surname string
	Likes   map[string]interface{}
	Hates   map[string]interface{}
	newGuy  bool //rpcclonable
}
func main() {
	foo := &Foo{
		Name:    "George",
		Surname: "Denkin",
		Likes: map[string]interface{}{
			"Sports":  "volleyball",
			"Message": "<Geroge> play volleyball <usually>",
		},
	}
	buf := &bytes.Buffer{} // or &strings.Builder{}, as in the example from @mkopriva
	enc := json.NewEncoder(buf)
	// Keep <, > and & as they are instead of \u003c, \u003e and \u0026.
	enc.SetEscapeHTML(false)
	err := enc.Encode(foo)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(buf.String())
}
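As far as I understand the package's behavior, json.Marshal passes the bytes returned by a MarshalJSON method through its usual compaction and HTML-escaping step, which would explain why the replacement made inside MarshalJSON did not survive. The sketch below illustrates this with a made-up Tag type and two hypothetical helpers, viaMarshal and viaEncoder; it is an illustration under those assumptions, not the code from the playground link.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// Tag is a hypothetical type whose MarshalJSON emits raw angle brackets.
// It assumes the value itself needs no JSON escaping (illustration only).
type Tag string

func (t Tag) MarshalJSON() ([]byte, error) {
	return []byte(`"<` + string(t) + `>"`), nil
}

// viaMarshal encodes v with json.Marshal, which escapes HTML characters.
func viaMarshal(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// viaEncoder encodes v with an Encoder that has HTML escaping disabled.
func viaEncoder(v interface{}) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a newline; trim it so both results are comparable.
	return strings.TrimSuffix(buf.String(), "\n"), nil
}

func main() {
	m, err := viaMarshal(Tag("b"))
	fmt.Println(m, err) // the brackets come back escaped as \u003c and \u003e
	e, err := viaEncoder(Tag("b"))
	fmt.Println(e, err) // the brackets are kept
}
```

With this in mind, the custom MarshalJSON method is not needed for this purpose; the Encoder setting alone controls the escaping.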

Is Go able to unmarshal to map[string][]interface{}?

I am trying to parse JSON into a map[string][]interface{}, but unmarshalling returns an error. According to (https://golang.org/pkg/encoding/json/), to unmarshal JSON into an interface value, Unmarshal stores one of these in the interface value:
bool, for JSON booleans
float64, for JSON numbers
string, for JSON strings
[]interface{}, for JSON arrays
map[string]interface{}, for JSON objects
nil for JSON null
I wonder whether Go is able to unmarshal into a map[string][]interface{}. The following is my code snippet. I am new to Go, so thanks in advance for any help.
// emailsStr looks like "{"isSchemaConforming":true,"schemaVersion":0,"unknown.0":[{"email_address":"test1#uber.com"},{"email_address":"test2#uber.com"}]}"
emailsRaw := make(map[string][]*entities.Email)
err := json.Unmarshal([]byte(emailsStr), &emailsRaw)
Error message:
&json.UnmarshalTypeError{Value:"number", Type:(*reflect.rtype)(0x151c7a0), Offset:44, Struct:"", Field:""}
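The target type map[string][]*entities.Email requires every value in the JSON object to be an array, but this object also contains a boolean and a number, which is why Unmarshal reports an UnmarshalTypeError. To decode an object with mixed value types dynamically, unmarshal into a map[string]interface{}. From there, you will need type assertions to pull out the values you want, like so: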
func main() {
	jsonStr := `{"isSchemaConforming":true,"schemaVersion":0,"unknown.0":[{"email_address":"test1#uber.com"},{"email_address":"test2#uber.com"}]}`
	dynamic := make(map[string]interface{})
	if err := json.Unmarshal([]byte(jsonStr), &dynamic); err != nil {
		panic(err)
	}
	// Each .(T) below is a type assertion; it panics if the value has a different type.
	firstEmail := dynamic["unknown.0"].([]interface{})[0].(map[string]interface{})["email_address"]
	fmt.Println(firstEmail)
}
(https://play.golang.org/p/VEUEIwj3CIC)
Each .(T) expression is a type assertion: it checks at run time that the dynamic value has the given type and yields the value as that type. This particular code will panic if any value has an unexpected type at run time, for example if the contents of unknown.0 aren't an array of JSON objects.
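If the dynamic route is needed anyway, the panic can be avoided with the two-value form of the type assertion, which reports a mismatch through a boolean. A minimal sketch follows; the helper name firstEmail and its error messages are made up for this illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// firstEmail (a hypothetical helper) walks the dynamic structure using
// "v, ok := x.(T)", which sets ok to false instead of panicking.
func firstEmail(jsonStr string) (string, error) {
	dynamic := make(map[string]interface{})
	if err := json.Unmarshal([]byte(jsonStr), &dynamic); err != nil {
		return "", err
	}
	list, ok := dynamic["unknown.0"].([]interface{})
	if !ok || len(list) == 0 {
		return "", errors.New(`"unknown.0" is not a non-empty array`)
	}
	entry, ok := list[0].(map[string]interface{})
	if !ok {
		return "", errors.New("first element is not an object")
	}
	addr, ok := entry["email_address"].(string)
	if !ok {
		return "", errors.New(`"email_address" is not a string`)
	}
	return addr, nil
}

func main() {
	jsonStr := `{"isSchemaConforming":true,"schemaVersion":0,"unknown.0":[{"email_address":"test1#uber.com"},{"email_address":"test2#uber.com"}]}`
	addr, err := firstEmail(jsonStr)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addr) // prints "test1#uber.com"
}
```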
The more idiomatic (and robust) way to do this in Go is to annotate a couple of structs with json:"..." tags and have encoding/json unmarshal into them. This avoids all the brittle .([]interface{}) type assertions:
type Email struct {
	Email string `json:"email_address"`
}
type EmailsList struct {
	IsSchemaConforming bool    `json:"isSchemaConforming"`
	SchemaVersion      int     `json:"schemaVersion"`
	Emails             []Email `json:"unknown.0"`
}
func main() {
	jsonStr := `{"isSchemaConforming":true,"schemaVersion":0,"unknown.0":[{"email_address":"test1#uber.com"},{"email_address":"test2#uber.com"}]}`
	emails := EmailsList{}
	if err := json.Unmarshal([]byte(jsonStr), &emails); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", emails)
}
(https://play.golang.org/p/iS6e0_87P2J)
A better approach is to use a struct for the main schema and a slice of Email structs for the email entities, then read the values you need from those. The solution is below:
package main
import (
	"encoding/json"
	"fmt"
)
type Data struct {
	IsSchemaConforming bool    `json:"isSchemaConforming"`
	SchemaVersion      float64 `json:"schemaVersion"`
	EmailEntity        []Email `json:"unknown.0"`
}
// Email struct
type Email struct {
	EmailAddress string `json:"email_address"`
}
func main() {
	jsonStr := `{"isSchemaConforming":true,"schemaVersion":0,"unknown.0":[{"email_address":"test1#uber.com"},{"email_address":"test2#uber.com"}]}`
	var dynamic Data
	json.Unmarshal([]byte(jsonStr), &dynamic)
	fmt.Printf("%#v", dynamic)
}

Preserve json.RawMessage through multiple marshallings

Background
I'm working with JSON data that must be non-repudiable.
The API that grants me this data also has a service to verify that the data originally came from them.
As best as I can tell, they do this by requiring that the complete JSON they originally sent needs to be supplied to them inside another JSON request, with no byte changes.
Issue
I can't seem to preserve the original JSON!
Because I cannot modify the original JSON, I have carefully preserved it as a json.RawMessage when unmarshalling:
// struct I unmarshal my original data into
type SignedResult struct {
	Raw       json.RawMessage `json:"random"`
	Signature string          `json:"signature"`
	// ...
}
// struct I marshal my data back into
type VerifiedSignatureReq struct {
	Raw       json.RawMessage `json:"random"`
	Signature string          `json:"signature"`
}
// ... getData is placeholder for function that gets my data
// (error handling omitted for brevity)
x := SignedResult{}
err := json.Unmarshal(getData(), &x)
// do some post-processing with SignedResult that does not alter `Raw` or `Signature`
// trouble begins here - x.Raw started off as json.RawMessage...
y, err := json.Marshal(VerifiedSignatureReq{Raw: x.Raw, Signature: x.Signature})
// but now the "random" value inside y is base64-encoded.
The problem is that []byte values / RawMessages are base64-encoded when marshaled. So I can't use this method, because it completely alters the string.
I'm unsure how to ensure this string is correctly preserved. I had assumed that the json.RawMessage specification in my struct would survive the perils of marshaling an already marshaled instance because it implements the Marshaler interface, but I appear mistaken.
Things I've Tried
My next attempt was to try:
// struct I unmarshal my original data into
type SignedResult struct {
	Raw       json.RawMessage `json:"random"`
	Signature string          `json:"signature"`
	// ...
}
// struct I marshal my data back into
type VerifiedSignatureReq struct {
	Raw       map[string]interface{} `json:"random"`
	Signature string                 `json:"signature"`
}
// ... getData is placeholder for function that gets my data
// (error handling omitted for brevity)
x := SignedResult{}
err := json.Unmarshal(getData(), &x)
// do some post-processing with SignedResult that does not alter `Raw` or `Signature`
var object map[string]interface{}
err = json.Unmarshal(x.Raw, &object)
// now correctly generates the JSON structure.
y, err := json.Marshal(VerifiedSignatureReq{Raw: object, Signature: x.Signature})
// but now this is not the same JSON string as received!
The issue with this approach is that there are minor byte-wise differences in the spacing between the data. It no longer looks exactly the same when catted to a file.
I cannot use string(x.Raw) either, because certain characters are escaped with \ when it is marshaled.
For your VerifiedSignatureReq struct, you will need a custom type with its own marshaler in place of json.RawMessage. Example:
type VerifiedSignatureReq struct {
	Raw       RawMessage `json:"random"`
	Signature string     `json:"signature"`
}
type RawMessage []byte
// MarshalJSON has a value receiver, so it is also used for non-pointer fields.
func (m RawMessage) MarshalJSON() ([]byte, error) {
	return []byte(m), nil
}
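A self-contained sketch of the round trip with such a type is shown below. The helper name reencode and the sample payload are made up for this illustration, and the nil check is an addition that mirrors what json.RawMessage does. As far as I know, since Go 1.8 json.RawMessage itself marshals this way for non-pointer values, so the custom type mainly matters on older toolchains.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RawMessage mirrors the custom type from the answer: its MarshalJSON
// has a value receiver, so it applies to non-pointer struct fields too.
type RawMessage []byte

func (m RawMessage) MarshalJSON() ([]byte, error) {
	if m == nil {
		return []byte("null"), nil
	}
	return []byte(m), nil
}

type SignedResult struct {
	Raw       json.RawMessage `json:"random"`
	Signature string          `json:"signature"`
}

type VerifiedSignatureReq struct {
	Raw       RawMessage `json:"random"`
	Signature string     `json:"signature"`
}

// reencode (a hypothetical helper) decodes a signed response and builds
// the verification request from it without re-parsing the raw payload.
func reencode(data []byte) ([]byte, error) {
	var res SignedResult
	if err := json.Unmarshal(data, &res); err != nil {
		return nil, err
	}
	return json.Marshal(VerifiedSignatureReq{
		Raw:       RawMessage(res.Raw),
		Signature: res.Signature,
	})
}

func main() {
	in := []byte(`{"random":{"data":[3,1,2],"id":7},"signature":"abc"}`)
	out, err := reencode(in)
	if err != nil {
		fmt.Println("reencode failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

One caveat to check against your data: json.Marshal still compacts and HTML-escapes whatever MarshalJSON returns, so a payload that contains insignificant whitespace or the characters <, > or & may not come out byte-for-byte identical. In that case an Encoder with SetEscapeHTML(false), or assembling the request bytes by hand, may be needed.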

How to enforce float in decimal format when encoding to JSON in Go

I have a big.Float which I'm encoding into JSON. However, the JSON always ends up showing the float in scientific notation rather than decimal notation. I can fix this by changing the JSON value to be a string rather than a number and using Float.Text with the 'f' format, but I would really prefer to keep the type as a number.
I was taking a look at Float.Format but I don't believe it is suitable.
A really condensed gist of what I'm doing is below. I do a lot more modification of the value of supply before encoding it to JSON.
type TokenSupply struct {
	// No space after the comma, otherwise the omitempty option is not recognized.
	TotalSupply *big.Float `json:"totalSupply,omitempty"`
}
supply := big.NewFloat(1000000)
json.NewEncoder(w).Encode(TokenSupply{supply})
This returns:
{"totalSupply":"1e+06"}
big.Float is marshaled as a JSON string because it implements encoding.TextMarshaler rather than json.Marshaler:
https://golang.org/pkg/encoding/json/#Marshal
Marshal traverses the value v recursively. If an encountered value implements the Marshaler interface and is not a nil pointer, Marshal calls its MarshalJSON method to produce JSON. If no MarshalJSON method is present but the value implements encoding.TextMarshaler instead, Marshal calls its MarshalText method and encodes the result as a JSON string. The nil pointer exception is not strictly necessary but mimics a similar, necessary exception in the behavior of UnmarshalJSON.
https://golang.org/pkg/math/big/#Float.MarshalText
func (x *Float) MarshalText() (text []byte, err error)
What can you do about it?
Since your float may need more than 64 bits, it won't play well with other languages that have to read the JSON value as a number. I'd suggest you keep the number as a string.
Caveats about encoding numbers that don't fit into 64 bits aside, here is how you could marshal a big.Float as a JSON number by wrapping it in a custom type that implements json.Marshaler. The key is that you can implement the MarshalJSON method any way you like, as long as it emits valid JSON:
package main
import (
	"encoding/json"
	"fmt"
	"math/big"
)
type BigFloatNumberJSON struct{ *big.Float }
func (bfn BigFloatNumberJSON) MarshalJSON() ([]byte, error) {
	// Use big.Float.String() or any other string converter
	// that emits a valid JSON number here...
	return []byte(bfn.String()), nil
}
func main() {
	totalSupply := new(big.Float).SetFloat64(1000000)
	obj := map[string]interface{}{
		"totalSupply": BigFloatNumberJSON{totalSupply},
	}
	bytes, err := json.Marshal(&obj)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(bytes))
	// => {"totalSupply":1000000}
}

How to convert utf8 string to []byte?

I want to unmarshal a string that contains JSON,
however the Unmarshal function takes a []byte as input.
How can I convert my UTF8 string to []byte?
This question is a possible duplicate of "How to assign string to bytes array", but I am still answering it because there is a better, alternative solution:
Converting from string to []byte is allowed by the spec, using a simple conversion:
Conversions to and from a string type
[...]
Converting a value of a string type to a slice of bytes type yields a slice whose successive elements are the bytes of the string.
So you can simply do:
s := "some text"
b := []byte(s) // b is of type []byte
However, the string => []byte conversion makes a copy of the string content (it has to, as strings are immutable while []byte values are not), and in case of large strings it's not efficient. Instead, you can create an io.Reader using strings.NewReader() which will read from the passed string without making a copy of it. And you can pass this io.Reader to json.NewDecoder() and unmarshal using the Decoder.Decode() method:
s := `{"somekey":"somevalue"}`
var result interface{}
err := json.NewDecoder(strings.NewReader(s)).Decode(&result)
fmt.Println(result, err)
Output:
map[somekey:somevalue] <nil>
Note: calling strings.NewReader() and json.NewDecoder() does have some overhead, so if you're working with small JSON texts, you can safely convert it to []byte and use json.Unmarshal(), it won't be slower:
s := `{"somekey":"somevalue"}`
var result interface{}
err := json.Unmarshal([]byte(s), &result)
fmt.Println(result, err)
The output is the same.
Note: if you're getting your JSON input string by reading some io.Reader (e.g. a file or a network connection), you can directly pass that io.Reader to json.NewDecoder(), without having to read the content from it first.
Just use []byte(s) on the string. For example:
package main
import (
	"encoding/json"
	"fmt"
)
func main() {
	s := `{"test":"ok"}`
	var data map[string]interface{}
	if err := json.Unmarshal([]byte(s), &data); err != nil {
		panic(err)
	}
	fmt.Printf("json data: %v", data)
}