How to convert utf8 string to []byte? - json

I want to unmarshal a string that contains JSON,
however the Unmarshal function takes a []byte as input.
How can I convert my UTF8 string to []byte?

This question is a possible duplicate of How to assign string to bytes array, but still answering it as there is a better, alternative solution:
Converting from string to []byte is allowed by the spec, using a simple conversion:
Conversions to and from a string type
[...]
Converting a value of a string type to a slice of bytes type yields a slice whose successive elements are the bytes of the string.
So you can simply do:
s := "some text"
b := []byte(s) // b is of type []byte
However, the string => []byte conversion makes a copy of the string content (it has to, as strings are immutable while []byte values are not), and in case of large strings it's not efficient. Instead, you can create an io.Reader using strings.NewReader() which will read from the passed string without making a copy of it. And you can pass this io.Reader to json.NewDecoder() and unmarshal using the Decoder.Decode() method:
s := `{"somekey":"somevalue"}`
var result interface{}
err := json.NewDecoder(strings.NewReader(s)).Decode(&result)
fmt.Println(result, err)
Output (try it on the Go Playground):
map[somekey:somevalue] <nil>
Note: calling strings.NewReader() and json.NewDecoder() does have some overhead, so if you're working with small JSON texts, you can safely convert it to []byte and use json.Unmarshal(), it won't be slower:
s := `{"somekey":"somevalue"}`
var result interface{}
err := json.Unmarshal([]byte(s), &result)
fmt.Println(result, err)
Output is the same. Try this on the Go Playground.
Note: if you're getting your JSON input string by reading some io.Reader (e.g. a file or a network connection), you can directly pass that io.Reader to json.NewDecoder(), without having to read the content from it first.

just use []byte(s) on the string. for example:
package main
import (
"encoding/json"
"fmt"
)
func main() {
s := `{"test":"ok"}`
var data map[string]interface{}
if err := json.Unmarshal([]byte(s), &data); err != nil {
panic(err)
}
fmt.Printf("json data: %v", data)
}
check it out on the playground here.

Related

strconv.Unquote does not remove escaped json characters, unmarshall fails in Go

I am parsing rather simple json where the slash in the date format string is escaped when it arrives in response.
however, when you try to unmarshall the string, it fails on "invalid syntax" error.
So I googled and we should use strconv.Unquote to replace the escaped characters first. Did that, and now the unquote function fails on the "unknown escape" error. However JSON RFC 8259 shows it is a valid case. So why valid JSON is causing failures in Go unmarshaller?
This is the simple JSON
{
"created_at": "6\/30\/2022 21:51:49"
}
var dst string
err := json.Unmarshal([]byte("6\/30\/2022 21:51:49"), dst) // this fails on ivnalid quote
fmt.Println(dst, err)
s, err := strconv.Unquote(`"6\/30\/2022 21:51:49"`) // this fails on ivnalid syntax
fmt.Println(s, err)
How does one proceed with such cases when the JSON is valid, but none of the built-in functions of Go actually parse it? What is the right way, apart from writing simple find and replace pattern in json []byte manually?
EDIT:
ok I think I should have asked a better question or provided a specific example from the scenario rather than just a portion of it. Garbage in garbage out, my fault.
Here is the thing. Let's say we have the specific, non standard time format like in the above json example.
{
"created_at": "6\/30\/2022 21:51:49"
}
I need to parse it into the struct with go time type. Because the standard parser for time would fail for that format I create a custom type.
type Request struct {
CreatedAt Time `json:"created_at"`
}
type Time time.Time
unc (d *Time) UnmarshalJSON(b []byte) error {
s := strings.Trim(string(b), "\"")
parse, err := time.Parse("1/2/2006 15:04:05", s) // this fails because of the escaped backslashes
if err != nil {
return err
}
*d = Time(parse)
return nil
}
Here is an example in Playground: https://go.dev/play/p/bE3AWQeV-ug
panic: parsing time "6\\/30\\/2022 21:51:49" as "1/2/2006 15:04:05": cannot parse "\\/30\\/2022 21:51:49" as "/"
This is the reason I am trying to put the Unquote into the custom unmarshal function, but that is not the right approach as it was already pointed out. How does the default string json unmarshaller for example removes those escaped slashes? Oh and I have no control about the side that writes the json, it comes in with single escaped slashes like that in []byte from *http.Response
Go has a great built-in function to convert a JSON string to a Go string: json.Unmarshal. Here is how you can integrate it with a custom UnmarshalJSON method:
func (d *Time) UnmarshalJSON(b []byte) error {
var s string
if err := json.Unmarshal(b, &s); err != nil {
return err
}
// Remove this line: s := strings.Trim(string(b), "\"")
// The rest of the code is unchanged
Unquote is designed for cases when you have a double quoted string like "\"My name\"", you don't really need that in this case.
The problem with that code is that you're trying to unmarshal a byte slice that isn't JSON. You're also unmarshaling into an string which isn't going to work. You need to unmarshal into a struct that has json tags or unmarshal into a map.
var dest map[string]string
someJsonString := `{
"created_at": "6\/30\/2022 21:51:49"
}`
err := json.Unmarshal([]byte(someJsonString), &dest)
if err != nil {
panic(err)
}
fmt.Println(dest)
if you wanted to do Unmarshal into a struct, you can do it like so
type SomeStruct struct {
CreatedAt string `json:"created_at"`
}
someJsonString := `{
"created_at": "6\/30\/2022 21:51:49"
}`
var data SomeStruct
err := json.Unmarshal([]byte(someJsonString), &data)
if err != nil {
panic(err)
}
fmt.Println(data)

Go: decoding json with one set of json tags, and encoding to a different set of json tags

I have an application that consumes data from a third-party api. I need to decode the json into a struct, which requires the struct to have json tags of the "incoming" json fields. The outgoing json fields have a different naming convention, so I need different json tags for the encoding.
I will have to do this with many different structs, and each struct might have many fields.
What is the best way to accomplish this without repeating a lot of code?
Example Structs:
// incoming "schema" field names
type AccountIn struct {
OpenDate string `json:"accountStartDate"`
CloseDate string `json:"cancelDate"`
}
// outgoing "schema" field names
type AccountOut struct {
OpenDate string `json:"openDate"`
CloseDate string `json:"closeDate"`
}
Maybe the coming change on Go 1.8 would help you, it will allow to 'cast' types even if its JSON tags definition is different: This https://play.golang.org/p/Xbsoa8SsEk works as expected on 1.8beta, I guess this would simplify your current solution
A bit an uncommon but probably quite well working method would be to use a intermediate format so u can use different readers and writers and therefore different tags. For example https://github.com/mitchellh/mapstructure which allows to convert a nested map structure into struct
types. Pretty similar like json unmarshal, just from a map.
// incoming "schema" field names
type AccountIn struct {
OpenDate string `mapstructure:"accountStartDate" json:"openDate"`
CloseDate string `mapstructure:"cancelDate" json:"closeDate"`
}
// from json to map with no name changes
temporaryMap := map[string]interface{}{}
err := json.Unmarshal(jsonBlob, &temporaryMap)
// from map to structs using mapstructure tags
accountIn := &AccountIn{}
mapstructure.Decode(temporaryMap, accountIn)
Later when writing (or reading) u will use directly the json functions which will then use the json tags.
If it's acceptable to take another round trip through json.Unmarshal and json.Marshal, and you don't have any ambiguous field names within your various types, you could translate all the json keys in one pass by unmarshaling into the generic structures used by the json package:
// map incoming to outgoing json identifiers
var translation = map[string]string{
"accountStartDate": "openDate",
"cancelDate": "closeDate",
}
func translateJS(js []byte) ([]byte, error) {
var m map[string]interface{}
if err := json.Unmarshal(js, &m); err != nil {
return nil, err
}
translateKeys(m)
return json.MarshalIndent(m, "", " ")
}
func translateKeys(m map[string]interface{}) {
for _, v := range m {
if v, ok := v.(map[string]interface{}); ok {
translateKeys(v)
}
}
keys := make([]string, 0, len(m))
for k := range m {
keys = append(keys, k)
}
for _, k := range keys {
if newKey, ok := translation[k]; ok {
m[newKey] = m[k]
delete(m, k)
}
}
}
https://play.golang.org/p/nXmWlj7qH9
This might be a Naive Approach but is fairly easy to implement:-
func ConvertAccountInToAccountOut(AccountIn incoming) (AccountOut outcoming){
var outcoming AccountOut
outcoming.OpenDate = incoming.OpenDate
outcoming.CloseDate = incoming.CloseDate
return outcoming
}
var IncomingJSONData AccountIn
resp := getJSONDataFromSource() // Some method that gives you the Input JSON
err1 := json.UnMarshall(resp,&IncomingJSONData)
OutGoingJSONData := ConvertAccountInToAccountOut(IncomingJSONData)
if err1 != nil {
fmt.Println("Error in UnMarshalling JSON ",err1)
}
fmt.Println("Outgoing JSON Data: ",OutGoingJSONData)

Why are string and []bytes treated differently when unmarshaling JSON?

My understanding from reading the documentation was that string is essentially an immutable []byte and that one can easily convert between the two.
However when unmarshaling from JSON this doesn't seem to be true. Take the following example program:
package main
import (
"encoding/json"
"fmt"
)
type STHRaw struct {
Hash []byte `json:"hash"`
}
type STHString struct {
Hash string `json:"hash"`
}
func main() {
bytes := []byte(`{"hash": "nuyHN9wx4lZL2L3Ir3dhZpmggTQEIHEZcC3DUNCtQsk="}`)
stringHead := new(STHString)
if err := json.Unmarshal(bytes, &stringHead); err != nil {
return
}
rawHead := new(STHRaw)
if err := json.Unmarshal(bytes, &rawHead); err != nil {
return
}
fmt.Printf("String:\t\t%x\n", stringHead.Hash)
fmt.Printf("Raw:\t\t%x\n", rawHead.Hash)
fmt.Printf("Raw to string:\t%x\n", string(rawHead.Hash[:]))
}
This gives the following output:
String: 6e7579484e397778346c5a4c324c3349723364685a706d67675451454948455a63433344554e437451736b3d
Raw: 9eec8737dc31e2564bd8bdc8af77616699a0813404207119702dc350d0ad42c9
Raw to string: 9eec8737dc31e2564bd8bdc8af77616699a0813404207119702dc350d0ad42c9
Instead I would have expected to receive the same value each time.
What is the difference?
The designers of the encoding/json package made the decision that applications must provide valid UTF-8 text in string values and that applications can put arbitrary byte sequences in []byte values. The package base64 encodes []byte values to ensure that the resulting string is valid UTF-8.
The encoding of []byte values is described in the Marshal function documentation.
This decision was not dictated by the design of the Go language. The string type can contain arbitrary byte sequences. The []byte type can contain valid UTF-8 text.
The designers could have used a flag in the field tag to indicate that a string or []byte value should be encoded and which encoder to use, but that's not what they did.

Catchall type for Golang API response

I am trying to define a struct that can hold an array of any type like so:
type APIResonse struct {
length int
data []interface{}
}
I want the data property to be capable of holding an array of any type/struct so I can have a single response type, that will eventually be serialized to json. So what I want to be able to write is something like the following:
someStruct := getSomeStructArray()
res := &APIResponse{
length: len(someStruct),
data: someStruct,
}
enc, err := json.Marshal(res)
Is this possible in Go? I keep getting cannot use cs (type SomeType) as type []interface {} in assignment. Or do I have to create a different response type for every variation of data? Or maybe I am going about this wrong entirely / not Go-like. Any help would be much appreciated!
There are a couple of problems with that code.
You need to use interface{}, not []interface{}, also [] is called a slice, an array is a fixed number of elements like [10]string.
And your APIResponse fields aren't exported, so json.Marshal will not print out anything.
func main() {
d := []dummy{{100}, {200}}
res := &APIResponse{
Length: len(d),
Data: d,
}
enc, err := json.Marshal(res)
fmt.Println(string(enc), err)
}
playground

How to unmarshal an escaped JSON string

I am using Sockjs with Go, but when the JavaScript client send json to the server it escapes it, and send's it as a []byte. I'm trying to figure out how to parse the json, so that i can read the data. but I get this error.
json: cannot unmarshal string into Go value of type main.Msg
How can I fix this? html.UnescapeString() has no effect.
val, err := session.ReadMessage()
if err != nil {
break
}
var msg Msg
err = json.Unmarshal(val, &msg)
fmt.Printf("%v", val)
fmt.Printf("%v", err)
type Msg struct {
Channel string
Name string
Msg string
}
//Output
"{\"channel\":\"buu\",\"name\":\"john\", \"msg\":\"doe\"}"
json: cannot unmarshal string into Go value of type main.Msg
You might want to use strconv.Unquote on your JSON string first :)
Here's an example, kindly provided by #gregghz:
package main
import (
"encoding/json"
"fmt"
"strconv"
)
type Msg struct {
Channel string
Name string
Msg string
}
func main() {
var msg Msg
var val []byte = []byte(`"{\"channel\":\"buu\",\"name\":\"john\", \"msg\":\"doe\"}"`)
s, _ := strconv.Unquote(string(val))
err := json.Unmarshal([]byte(s), &msg)
fmt.Println(s)
fmt.Println(err)
fmt.Println(msg.Channel, msg.Name, msg.Msg)
}
You need to fix this in the code that is generating the JSON.
When it turns out formatted like that, it is being JSON encoded twice. Fix that code that is doing the generating so that it only happens once.
Here's some JavaScript that shows what's going on.
// Start with an object
var object = {"channel":"buu","name":"john", "msg":"doe"};
// serialize once
var json = JSON.stringify(object); // {"channel":"buu","name":"john","msg":"doe"}
// serialize twice
json = JSON.stringify(json); // "{\"channel\":\"buu\",\"name\":\"john\",\"msg\":\"doe\"}"
Sometimes, strconv.Unquote doesn't work.
Heres an example shows the problem and my solution.
(The playground link: https://play.golang.org/p/Ap0cdBgiA05)
Thanks for #Crazy Train's "encodes twice" idea, I just decoded it twice ...
package main
import (
"encoding/json"
"fmt"
"strconv"
)
type Wrapper struct {
Data string
}
type Msg struct {
Photo string
}
func main() {
var wrapper Wrapper
var original = `"{\"photo\":\"https:\/\/www.abc.net\/v\/t1.0-1\/p320x320\/123.jpg\"}"`
_, err := strconv.Unquote(original)
fmt.Println(err)
var val []byte = []byte("{\"data\":"+original+"}")
fmt.Println(string(val))
err = json.Unmarshal([]byte(val), &wrapper)
fmt.Println(wrapper.Data)
var msg Msg
err = json.Unmarshal([]byte(wrapper.Data), &msg)
fmt.Println(msg.Photo)
}
As Crazy Train pointed out, it appears that your input is doubly escaped, thus causing the issue. One way to fix this is to make sure that the function session.ReadMessasge() returns proper output that is escaped appropriately. However, if that's not possible, you can always do what x3ro suggested and use the golang function strconv.Unquote.
Here's a playground example of it in action:
http://play.golang.org/p/GTishI0rwe
The data shown in the problem is stringified for some purposes, in some cases you can even have \n in your string representing break of line in your json.
Let's understand the easiest way to unmarshal/deserialize this kind of data using the following example:
Next line shows the data you get from your sources and want to derserialize
stringifiedData := "{\r\n \"a\": \"b\",\r\n \"c\": \"d\"\r\n}"
Now, remove all new lines first
stringifiedData = strings.ReplaceAll(stringifiedData, "\n", "")
Then remove all the extra quotes that are present in your string
stringifiedData = strings.ReplaceAll(stringifiedData, "\\"", "\"")
Let's now convert the string into a byte array
dataInBytes := []byte(stringifiedData)
Before doing unmarshal, let's define structure of our data
jsonObject := &struct{
A string `json:"a"`
C string `json:"c"`
}
Now, you can dersialize your values into jsonObject
json.Unmarshal(dataInBytes, jsonObject)}
You got into infamous escaped string pitfall from JavaScript. Quite often people face (as I did) the same issue in Go, when serializing JSON string with json.Marshal, e.g.:
in := `{"firstName":"John","lastName":"Dow"}`
bytes, err := json.Marshal(in)
json.Marshal escapes double quotes, producing the same issue when you try to unmarshal bytes into struct.
If you faced the issue in Go, have a look at How To Correctly Serialize JSON String In Golang post which describes the issue in details with solution to it.