Parsing JSON concurrently - panic or runtime error (decoding related) - json

I was playing with Go recently and got stuck on a runtime error I can't explain. These are my working functions.
type User struct {
    Browsers []string `json:"browsers"`
    Name     string   `json:"name"`
    Email    string   `json:"email"`
}

func asyncUserProcJson(wg *sync.WaitGroup, users *[]User, ch chan []byte) {
    for buf := range ch {
        var mu sync.Mutex
        var user User
        mu.Lock()
        err := json.Unmarshal(buf, &user)
        mu.Unlock()
        if err != nil {
            fmt.Println("json:", err)
            wg.Done()
            continue
        }
        *users = append(*users, user)
        wg.Done()
    }
}

func userProcJson(buf []byte) (User, error) {
    var user User
    err := json.Unmarshal(buf, &user)
    if err != nil {
        return User{}, err
    }
    return user, nil
}
If I use the common, non-concurrent approach, it works as expected. But if I try to use a channel to pass the bytes to a goroutine... it fails.
type AsyncUserProc func(*sync.WaitGroup, *[]User, chan []byte)
type UserProc func(buf []byte) (User, error)

type SearchParams struct {
    out           io.Writer
    asyncUserProc AsyncUserProc
    userProc      UserProc
}

func (sp SearchParams) AsyncSearch() []User {
    file, err := os.Open(filePath)
    if err != nil {
        log.Fatalln(err)
    }
    var Users = make([]User, 0, 1024)
    var ch = make(chan []byte)
    var wg sync.WaitGroup
    go sp.asyncUserProc(&wg, &Users, ch)
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        wg.Add(1)
        ch <- scanner.Bytes()
    }
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading standard input:", err)
    }
    close(ch)
    wg.Wait()
    return Users
}

func (sp SearchParams) Search() []User {
    file, err := os.Open(filePath)
    if err != nil {
        log.Fatalln(err)
    }
    // json processor
    var Users = make([]User, 0, 1024)
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        u, err := sp.userProc(scanner.Bytes())
        if err != nil {
            log.Println(err)
            continue
        }
        Users = append(Users, u)
    }
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading standard input:", err)
    }
    return Users
}
The workflow is as follows:
filePath contains JSON chunks (one per line).
Open the file for reading.
Create a line scanner.
(AsyncSearch)
Pass each line to the channel.
Receive the line from the channel via range (a blocking operation).
Pass it to json.Unmarshal.
Troubles.
(Search)
Pass each line directly to the userProc func.
Enjoy the result.
I am getting a lot of (different) errors:
many JSON unmarshaling errors,
index out of range,
JSON decoder out of sync - data changing underfoot?
The description of the last error reads:
// phasePanicMsg is used as a panic message when we end up with something that
// shouldn't happen. It can indicate a bug in the JSON decoder, or that
// something is editing the data slice while the decoder executes.
So here is the question: how is the byte slice being modified?
I thought sending on the channel was a blocking operation. What am I missing in the language mechanics?
Examples of the errors (different on each run):
json: invalid character 'i' looking for beginning of value
json: invalid character ':' after top-level value
json: invalid character 'r' looking for beginning of value
panic: runtime error: index out of range
----
json: invalid character '.' after top-level value
json: invalid character 'K' looking for beginning of value
panic: JSON decoder out of sync - data changing underfoot?

Package bufio
import "bufio"
func (*Scanner) Bytes
func (s *Scanner) Bytes() []byte
Bytes returns the most recent token generated by a call to Scan. The
underlying array may point to data that will be overwritten by a
subsequent call to Scan. It does no allocation.
The underlying array may point to data that will be overwritten by a subsequent call to Scan.
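That is exactly what happens here: the slice sent into the channel still points at the scanner's internal buffer, and the next Scan() overwrites that buffer while json.Unmarshal is still reading it in the other goroutine. A minimal sketch of the fix, using the names from the question's AsyncSearch, is to copy each line before sending it:

for scanner.Scan() {
    wg.Add(1)
    // scanner.Bytes() reuses its backing array on the next Scan(), so send a copy.
    line := make([]byte, len(scanner.Bytes()))
    copy(line, scanner.Bytes())
    ch <- line
}

Equivalently, ch <- []byte(scanner.Text()) works, because scanner.Text() already returns a copy as a string.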

Related

Go reading map from json stream

I need to parse a really long JSON file (more than a million items). I don't want to load it into memory, so I read it chunk by chunk. There's a good example with an array of items here. The problem is that I'm dealing with a map, and when I call Decode I get not at beginning of value.
I can't figure out what should be changed.
const data = `{
"object1": {"name": "cattle","location": "kitchen"},
"object2": {"name": "table","location": "office"}
}`

type ReadObject struct {
    Name     string `json:"name"`
    Location string `json:"location"`
}

func ParseJSON() {
    dec := json.NewDecoder(strings.NewReader(data))

    tkn, err := dec.Token()
    if err != nil {
        log.Fatalf("failed to read opening token: %v", err)
    }
    fmt.Printf("opening token: %v\n", tkn)

    objects := make(map[string]*ReadObject)
    for dec.More() {
        var nextSymbol string
        if err := dec.Decode(&nextSymbol); err != nil {
            log.Fatalf("failed to parse next symbol: %v", err)
        }
        nextObject := &ReadObject{}
        if err := dec.Decode(&nextObject); err != nil {
            log.Fatalf("failed to parse next object")
        }
        objects[nextSymbol] = nextObject
    }

    tkn, err = dec.Token()
    if err != nil {
        log.Fatalf("failed to read closing token: %v", err)
    }
    fmt.Printf("closing token: %v\n", tkn)
    fmt.Printf("OBJECTS: \n%v\n", objects)
}
TL;DR: when you call the Token() method for the first time, you move the offset away from the beginning (of a JSON value), and therefore you get the error.
You are working with this struct (link):
type Decoder struct {
    // other fields omitted for simplicity
    tokenState int
}
Pay attention to the tokenState field. Its value can be one of (link):
const (
    tokenTopValue = iota
    tokenArrayStart
    tokenArrayValue
    tokenArrayComma
    tokenObjectStart
    tokenObjectKey
    tokenObjectColon
    tokenObjectValue
    tokenObjectComma
)
Let's get back to your code. You call the Token() method. It consumes the first valid JSON token, {, and changes tokenState from tokenTopValue to tokenObjectStart (link). Now you are in an "inside an object" state.
If you try to call Decode() at this point, you will get an error (not at beginning of value). This is because the tokenState values that allow calling Decode() are tokenTopValue, tokenArrayStart, tokenArrayValue, and tokenObjectValue, i.e. a "full" value, not part of one (link).
To avoid this, you can simply not call Token() at all and do something like this:
dec := json.NewDecoder(strings.NewReader(dataMapFromJson))
objects := make(map[string]*ReadObject)
if err := dec.Decode(&objects); err != nil {
    log.Fatalf("failed to parse next symbol: %v", err)
}
fmt.Printf("OBJECTS: \n%v\n", objects)
Or, if you want to read chunk by chunk, you could keep calling Token() until you reach a "full" value, and then call Decode() on that value (I guess this should work).
After consuming the initial { with your first call to dec.Token(), you must:
use dec.Token() to extract the next key,
then call dec.Decode(&nextObject) to decode the corresponding entry.
Example code:
for dec.More() {
    key, err := dec.Token()
    if err != nil {
        // handle error
    }
    var val interface{}
    err = dec.Decode(&val)
    if err != nil {
        // handle error
    }
    fmt.Printf(" %s : %v\n", key, val)
}
https://play.golang.org/p/5r1d8MsNlKb

Chrome native messaging host in golang fails when JSON size is more than 65500 characters

I am trying to write a native messaging host for Chrome in Go. For this purpose, I tried using the chrome-go as well as the chrome-native-messaging packages. Both presented the same problem, as explained below.
Here is the code. I have added the relevant parts from the chrome-go package to the main file instead of importing it, for easier understanding.
The following code actually works when I send a JSON message to it like {"content": "Apple Mango"}. However, it stops working once the length of the JSON goes over approximately 65500 characters, give or take 100 characters. There is no error output either.
package main

import (
    "encoding/binary"
    "encoding/json"
    "fmt"
    "io"
    "os"
)

var byteOrder binary.ByteOrder = binary.LittleEndian

func Receive(reader io.Reader) ([]byte, error) {
    // Read message length in native byte order
    var length uint32
    if err := binary.Read(reader, byteOrder, &length); err != nil {
        return nil, err
    }
    // Return if no message
    if length == 0 {
        return nil, nil
    }
    // Read message body
    received := make([]byte, length)
    if n, err := reader.Read(received); err != nil || n != len(received) {
        return nil, err
    }
    return received, nil
}

type response struct {
    Content string `json:"content"`
}

func main() {
    msg, err := Receive(os.Stdin)
    if err != nil {
        panic(err)
    }
    var res response
    err = json.Unmarshal([]byte(msg), &res)
    if err != nil {
        panic(err)
    }
    fmt.Println(res.Content)
}
For those interested in testing, I have set up a repository with instructions. Run the following
git clone --depth=1 https://tesseract-index#bitbucket.org/tesseract-index/chrome-native-messaging-test-riz.git && cd chrome-native-messaging-test-riz
./json2msg.js < test-working.json | go run main.go
./json2msg.js < test-not-working.json | go run main.go
You will see that test-not-working.json gives no output, although its difference with test-working.json is a few hundred characters only.
What is the issue here?
There is a limit on the pipe buffer capacity, which varies across systems. Mac OS X, for example, uses a capacity of 16384 bytes by default.
You can use this bash script to check your buffer capacity:
M=0; while printf A; do >&2 printf "\r$((++M)) B"; done | sleep 999
So it is not related to Go: I changed your code to read from a file and unmarshal it, and it worked:
func main() {
    reader, err := os.Open("test-not-working.json")
    if err != nil {
        panic(err)
    }
    var res response
    decoder := json.NewDecoder(reader)
    err = decoder.Decode(&res)
    if err != nil {
        panic(err)
    }
    fmt.Println(res.Content)
}
This is because the pipe buffer of your OS is limited to 65536 bytes, so a single os.Stdin.Read(...) call returns at most 65536 bytes.
You can fix your code with this simple replacement:
n, err := io.ReadFull(reader, received)
And there is your error:
msg, err := Receive(os.Stdin)
if err != nil {
    panic(err)
}
You compared err with nil, but you did not compare msg with nil. Since only 65532 (65536 - 4) bytes were read, Receive(...) returned nil, nil.
To fix this, your Receive(...) function should not return nil, nil in that case.
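Putting both points together, here is a sketch of a Receive that assembles the whole message no matter how the pipe delivers it, using the same length prefix and imports (encoding/binary, io) as the question's code. With io.ReadFull, a short read becomes an io.ErrUnexpectedEOF error instead of a silent nil, nil:

func Receive(reader io.Reader) ([]byte, error) {
    var length uint32
    if err := binary.Read(reader, byteOrder, &length); err != nil {
        return nil, err
    }
    if length == 0 {
        return nil, nil
    }
    received := make([]byte, length)
    // io.ReadFull keeps reading until the buffer is full or an error occurs,
    // so a message larger than one pipe buffer is assembled from several reads.
    if _, err := io.ReadFull(reader, received); err != nil {
        return nil, err
    }
    return received, nil
}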

send and read a []byte between two microservices in golang

I have a data encryption function that returns a []byte. Of course, what has been encrypted must be decrypted (through another function) in another microservice.
The problem arises when I send the []byte via JSON: the []byte is transformed into a string, and when I then read the JSON on the other side of the call, the result is no longer the same.
I have to be able to pass the original []byte, created by the encryption function, through JSON, or otherwise pass the []byte through a call like the one you can see below. Another possibility is to change the decryption function, but I have not succeeded.
caller function
func Dati_mono(c *gin.Context) {
    id := c.Param("id")
    oracle, err := http.Get("http://XXXX/" + id)
    if err != nil {
        panic(err)
    }
    defer oracle.Body.Close()
    oJSON, err := ioutil.ReadAll(oracle.Body)
    if err != nil {
        panic(err)
    }
    oracleJSON := security.Decrypt(oJSON, keyEn)
    c.JSON(http.StatusOK, string(oJSON))
}
function that is called with the url
func Dati(c *gin.Context) {
    var (
        person Person
        result mapstring.Dati_Plus
        mmap   []map[string]interface{}
    )
    rows, err := db.DBConor.Query("SELECT COD_DIPENDENTE, MATRICOLA, COGNOME FROM ANDIP021_K")
    if err != nil {
        fmt.Print(err.Error())
    }
    for rows.Next() {
        err = rows.Scan(&person.COD_DIPENDENTE, &person.MATRICOLA, &person.COGNOME)
        ciao := structs.Map(&person)
        mmap = append(mmap, ciao)
    }
    defer rows.Close()
    result = mapstring.Dati_Plus{
        len(mmap),
        mmap,
    }
    jsonEn := []byte(mapstring.Dati_PlustoStr(result))
    keyEn := []byte(key)
    cipherjson, err := security.Encrypt(jsonEn, keyEn)
    if err != nil {
        log.Fatal(err)
    }
    c.JSON(http.StatusOK, cipherjson)
}
encryption and decryption functions
func Encrypt(json []byte, key []byte) (string, error) {
    k, err := aes.NewCipher(key)
    if err != nil {
        return "nil", err
    }
    gcm, err := cipher.NewGCM(k)
    if err != nil {
        return "nil", err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err = io.ReadFull(rand.Reader, nonce); err != nil {
        return "nil", err
    }
    return gcm.Seal(nonce, nonce, json, nil), nil
}

func Decrypt(cipherjson []byte, key []byte) ([]byte, error) {
    k, err := aes.NewCipher(key)
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(k)
    if err != nil {
        return nil, err
    }
    nonceSize := gcm.NonceSize()
    if len(cipherjson) < nonceSize {
        return nil, errors.New("cipherjson too short")
    }
    nonce, cipherjson := cipherjson[:nonceSize], cipherjson[nonceSize:]
    return gcm.Open(nil, nonce, cipherjson, nil)
}
Everything works; the problem arises when I print cipherjson in c.JSON(): the []byte is translated into a string.
When it is then fetched and read by the calling function, it is read as a string, and ioutil.ReadAll() produces the []byte of that string.
Instead, I must be able to pass to the Decrypt function the return value of the Encrypt function used in the called function.
I hope I was clear, thanks in advance.
You are not decoding the response before decrypting. In other words, you are handing the JSON encoding of the ciphertext to Decrypt. That is obviously not going to do what you want. To recover the plaintext you have to precisely undo all of the operations of the encryption and encoding in reverse order.
Either decode before decrypting, or don't JSON encode on the server. For instance:
oJSON, err := ioutil.ReadAll(oracle.Body)
if err != nil {
    panic(err)
}
var ciphertext string
if err := json.Unmarshal(oJSON, &ciphertext); err != nil {
    // TODO: handle error
}
oracleJSON := security.Decrypt(ciphertext, keyEn)
Although it is unclear why you even go through the trouble of JSON encoding in the first place. You might as well just write the ciphertext directly. If you really want to encode the ciphertext, you should not convert it to a string. The ciphertext is just a bunch of random bytes, not remotely resembling a UTF-8 encoded string, so don't treat it like one. encoding/json uses the base64 encoding for byte slices automatically, which is a much cleaner (and probably shorter) representation of the ciphertext than tons of unicode escape sequences.
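As a tiny illustration of that last point (a sketch, not the question's code): if you marshal a struct with a []byte field, encoding/json base64-encodes it on the way out and decodes it back on the way in:

package main

import (
    "encoding/json"
    "fmt"
)

type payload struct {
    Ciphertext []byte `json:"ciphertext"`
}

func main() {
    out, _ := json.Marshal(payload{Ciphertext: []byte{0x01, 0xfe, 0x9a}})
    fmt.Println(string(out)) // {"ciphertext":"Af6a"} (base64, not escaped bytes)

    var in payload
    _ = json.Unmarshal(out, &in)
    fmt.Println(in.Ciphertext) // [1 254 154]
}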
Independent of the encoding you choose (if any), your Encrypt function is broken.
// The plaintext and dst must overlap exactly or not at all. To reuse
// plaintext's storage for the encrypted output, use plaintext[:0] as dst.
Seal(dst, nonce, plaintext, additionalData []byte) []byte
The first argument is the destination for the encryption. If you don't need to retain the plaintext, pass json[:0]; otherwise pass nil.
Also, Decrypt expects the ciphertext to be prefixed by the nonce, but Encrypt doesn't prepend it.
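To illustrate, here is a sketch (not the asker's exact code, and using the same crypto/aes, crypto/cipher, crypto/rand and io imports as the question) of an Encrypt that passes nil as dst per the advice above, prepends the nonce, and returns raw bytes so the question's Decrypt can undo it:

func Encrypt(plaintext, key []byte) ([]byte, error) {
    k, err := aes.NewCipher(key)
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(k)
    if err != nil {
        return nil, err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return nil, err
    }
    sealed := gcm.Seal(nil, nonce, plaintext, nil)
    // Prepend the nonce so Decrypt can slice it back off.
    return append(nonce, sealed...), nil
}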

Custom marshalling to bson and JSON (Golang & mgo)

I have the following type in Golang:
type Base64Data []byte
In order to support unmarshalling a base64 encoded string to this type, I did the following:
func (b *Base64Data) UnmarshalJSON(data []byte) error {
    if len(data) == 0 {
        return nil
    }
    content, err := base64.StdEncoding.DecodeString(string(data[1 : len(data)-1]))
    if err != nil {
        return err
    }
    *b = content
    return nil
}
Now I also want to be able to marshal and unmarshal it to a Mongo database, using the mgo Go library.
The problem is that I already have documents there stored as base64 encoded string, so I have to maintain that.
I tried to do the following:
func (b Base64Data) GetBSON() (interface{}, error) {
    return base64.StdEncoding.EncodeToString([]byte(b)), nil
}

func (b *Base64DecodedXml) SetBSON(raw bson.Raw) error {
    var s string
    var err error
    if err = raw.Unmarshal(&s); err != nil {
        return err
    }
    *b, err = base64.StdEncoding.DecodeString(s)
    return err
}
So after unmarshaling, the data is already decoded; I need to encode it back and return it as a string so it will be written to the db as a string (and vice versa).
For that I implemented the bson getter and setter, but it seems only the getter is working properly.
JSON unmarshaling from a base64 encoded string works, as does marshaling it to the database, but the unmarshaling setter does not seem to be called at all.
Can anyone suggest what I'm missing, so that I can properly hold the data decoded in memory but encoded as a string in the database?
This is a test I tried to run:
b := struct {
    Value shared.Base64Data `json:"value" bson:"value"`
}{}
s := `{"value": "PHJvb3Q+aGVsbG88L3Jvb3Q+"}`
require.NoError(t, json.Unmarshal([]byte(s), &b))
t.Logf("%v", string(b.Value))

b4, err := bson.Marshal(b)
require.NoError(t, err)
t.Logf("%v", string(b4))

require.NoError(t, bson.Unmarshal(b4, &b))
t.Logf("%v", string(b.Value))
You can't marshal any value with bson.Marshal(), only maps and struct values.
If you want to test it, pass a map, e.g. bson.M to bson.Marshal():
var x = Base64Data{0x01, 0x02, 0x03}
dd, err := bson.Marshal(bson.M{"data": x})
fmt.Println(string(dd), err)
Your code works as-is, and as you intend it to. Try to insert a wrapper value to verify it:
c := sess.DB("testdb").C("testcoll")
var x = Base64Data{0x01, 0x02, 0x03}
if err := c.Insert(bson.M{
    "data": x,
}); err != nil {
    panic(err)
}
This will save the data as a string, being the Base64 encoded form.
Of course if you want to load it back into a value of type Base64Data, you will also need to define the SetBSON(raw Raw) error method too (bson.Setter interface).
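For example, a sketch of such a setter on Base64Data itself, mirroring the setter already shown in the question (same encoding/base64 and bson imports); it just reverses the base64 encoding done by GetBSON:

func (b *Base64Data) SetBSON(raw bson.Raw) error {
    var s string
    if err := raw.Unmarshal(&s); err != nil {
        return err
    }
    decoded, err := base64.StdEncoding.DecodeString(s)
    if err != nil {
        return err
    }
    *b = Base64Data(decoded)
    return nil
}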

Is there a simpler way to implement a JSON REST service with net/http?

I am trying to develop a REST service with net/http.
The service receives a JSON structure containing all the input parameters. I wonder if there is an easier and shorter way to implement the following:
func call(w http.ResponseWriter, r *http.Request) {
    if err := r.ParseForm(); err != nil {
        fmt.Printf("Error parsing request %s\n", err)
    }
    var buf []byte
    buf = make([]byte, 256)
    var n, err = r.Body.Read(buf)
    var decoded map[string]interface{}
    err = json.Unmarshal(buf[:n], &decoded)
    if err != nil {
        fmt.Printf("Error decoding json: %s\n", err)
    }
    var uid = decoded["uid"]
    ...
}
As you can see it requires quite a number of lines just to get to the extraction of the first parameter. Any ideas?
You don't need to call r.ParseForm if the body of the request will contain a JSON structure and you don't need any URL parameters.
You don't need the buffer either; you can use:
decoder := json.NewDecoder(r.Body)
And then:
err := decoder.Decode(&decoded)
Putting it all together:
func call(w http.ResponseWriter, r *http.Request) {
    values := make(map[string]interface{})
    if err := json.NewDecoder(r.Body).Decode(&values); err != nil {
        panic(err)
    }
    // JSON numbers decode into an interface{} map as float64, so assert float64, not int.
    uid := values["uid"].(float64)
}
It would be much nicer, though, if you could formally define the structure of the input that you're expecting in a struct type:
type UpdateUserInformationRequest struct {
    UserId int `json:"uid"`
    // other fields...
}
And use an instance of this struct instead of a more general map.
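For example, a short sketch of the handler using that struct (only names already defined above, plus encoding/json, fmt and net/http):

func call(w http.ResponseWriter, r *http.Request) {
    var req UpdateUserInformationRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "bad request: "+err.Error(), http.StatusBadRequest)
        return
    }
    // req.UserId is already an int, so no map lookups or type assertions are needed.
    fmt.Fprintf(w, "updating user %d\n", req.UserId)
}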