How do I substitute bytes in a stream using Go standard library? - json

I have an io.Reader, which I get from http.Request.Body, that reads a JSON byte slice from a server.
I would like to stream this to json.NewDecoder. However, I would also like to intercept the JSON before it hits json.NewDecoder and substitute certain parts of it. For example, the JSON string contains empty hashes "{}" which I would like to remove due to a bug in the server's JSON output.
I am currently achieving my goal using json.Unmarshal but not using the JSON streaming parser:
data, _ := ioutil.ReadAll(r.Body)
data = bytes.Replace(data, []byte("{}"), []byte(""), -1)
json.Unmarshal(data, [my struct])
How can I achieve the same thing as above, but using json.NewDecoder, so that I avoid the extra full passes over r.Body's data? Here's some code using a pseudo function ReplaceStream(r io.Reader, old, new []byte):
reader := ReplaceStream(r.Body, []byte("{}"), []byte(""))
dec := json.NewDecoder(reader)
dec.Decode([my struct])
I know ReplaceStream might be fairly trivial to make, but is there anything in the standard library to do this that I am unaware of?

My advice is to just treat that kind of message as a special case and avoid the extra parsing/substituting for all the other requests:
data, _ := ioutil.ReadAll(r.Body)
// FIXME: work around bug #12312 in the JSON server
if bytes.Equal(data, []byte(`{"list": [{}]}`)) {
    return nil
}
// Normal datastruct ...
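Nothing in the standard library implements ReplaceStream out of the box, so if the special-casing isn't enough you'd write it yourself. Below is a minimal sketch, assuming old is non-empty and that a replacement never creates a new match spanning a read boundary:
package jsonutil // hypothetical package name

import (
    "bytes"
    "io"
)

// ReplaceStream returns a reader that yields r's bytes with every
// occurrence of old replaced by new. It holds back len(old)-1 bytes
// between reads so matches straddling chunk boundaries are still found.
func ReplaceStream(r io.Reader, old, new []byte) io.Reader {
    pr, pw := io.Pipe()
    go func() {
        var carry []byte
        buf := make([]byte, 4096)
        for {
            n, err := r.Read(buf)
            chunk := append(carry, buf[:n]...)
            if err != nil {
                // Flush the remainder on EOF or any other error.
                pw.Write(bytes.Replace(chunk, old, new, -1))
                pw.CloseWithError(err)
                return
            }
            chunk = bytes.Replace(chunk, old, new, -1)
            // Keep the tail: a match could start in this chunk and end in the next.
            keep := len(old) - 1
            if keep > len(chunk) {
                keep = len(chunk)
            }
            pw.Write(chunk[:len(chunk)-keep])
            carry = append([]byte(nil), chunk[len(chunk)-keep:]...)
        }
    }()
    return pr
}
With that, the usage from the question works as written: json.NewDecoder(ReplaceStream(r.Body, []byte("{}"), []byte(""))). Keep in mind a blind byte-level replace can also hit a legitimate "{}" inside a JSON string value, which is another reason to prefer the special-case approach above.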

Related

Wrapping json member fields to object

My objective is to add fields to json on user request.
Everything is great, but when displaying the fields with
fmt.Printf("%s: %s\n", content.Date, content.Description)
an error occurs:
invalid character '{' after top-level value
And that is because after adding new fields the file looks like this:
{"Date":"2017-03-20 10:46:48","Description":"new"}
{"Date":"2017-03-20 10:46:51","Description":"new .go"}
The biggest problem is with writing to the file:
reminder := &Name{dateString[:19], text} // text - input string
newReminder, _ := json.Marshal(&reminder)
I don't really know how to do this properly.
My question is how should I wrap all member fields into one object?
And what is the best way to iterate through member fields?
The code is available here: https://play.golang.org/p/NunV_B6sud
You should store the reminders in an array inside the JSON file, as mentioned by @Gerben Jacobs, and then, every time you want to add a new reminder to the array, you need to read the full contents of rem.json, append the new reminder in Go, truncate the file, and write the new slice to it. Here's a quick implementation: https://play.golang.org/p/UKR91maQF2.
If you have lots of reminders and the process of reading, decoding, encoding, and writing the whole content becomes a pain, you could open the file, implement a way to truncate only the last ] from the file contents, and then write only , + new reminder + ], as sketched below.
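For illustration, here is one way that could look - a minimal sketch, assuming rem.json already holds a non-empty JSON array and ends exactly with ] (no trailing newline):
package main

import (
    "encoding/json"
    "io"
    "os"
)

// Name mirrors the struct from the question.
type Name struct {
    Date        string
    Description string
}

// appendReminder overwrites the closing ']' with ",<new reminder>]",
// avoiding a full read/decode/encode/write cycle.
func appendReminder(rem Name) error {
    f, err := os.OpenFile("rem.json", os.O_RDWR, 0644)
    if err != nil {
        return err
    }
    defer f.Close()
    // Put the write cursor on top of the final ']'.
    if _, err := f.Seek(-1, io.SeekEnd); err != nil {
        return err
    }
    b, err := json.Marshal(rem)
    if err != nil {
        return err
    }
    _, err = f.Write(append(append([]byte{','}, b...), ']'))
    return err
}

func main() {
    if err := appendReminder(Name{"2017-03-20 10:46:51", "new .go"}); err != nil {
        panic(err)
    }
}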
So after some research, people in the go-nuts group helped me out and suggested using a streaming JSON parser that parses items individually.
So I needed to change my reminder listing function:
func listReminders() error {
    f, err := os.Open("rem.json")
    if err != nil {
        return err
    }
    defer f.Close()
    dec := json.NewDecoder(f)
    for {
        var content Name
        switch err := dec.Decode(&content); err {
        case nil:
            fmt.Printf("%#v\n", content)
        case io.EOF:
            return nil
        default:
            return err
        }
    }
}
Now everything works the way I wanted.

Most efficient way to convert io.ReadCloser to byte array

I have a very simple Go webserver. Its job is to receive an inbound JSON payload. It then publishes the payload to one or more services that expect a byte array. The payload doesn't need to be checked, just sent over.
In this case, it receives an inbound job and sends it to Google PubSub. It might be another service - it doesn't really matter. I'm trying to find the most efficient way to convert the object to a byte array without first decoding it.
Why? Seems a bit wasteful to decode and convert to JSON on one server, only to unmarshal it later. Plus, I don't want to maintain two identical structs in two packages.
How is it possible to convert the io.ReadCloser to a byte array so I only need to unmarshal once? I tried something like this answer, but I don't think that's the most efficient way either:
From io.Reader to string in Go
My http server code looks like this:
func Collect(d DbManager) http.HandlerFunc {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Type", "application/json; charset=utf-8")
        code := 422
        obj := Report{}
        response := Response{}
        response.Message = "Invalid request"
        decoder := json.NewDecoder(r.Body)
        decoder.Decode(&obj)
        if obj.Device.MachineType != "" {
            msg, _ := json.Marshal(obj)
            if d.Publish(msg, *Topic) {
                code = 200
            }
            response.Message = "Ok"
        }
        a, _ := json.Marshal(response)
        w.WriteHeader(code)
        w.Write(a)
    })
}
You convert a Reader to bytes by reading it. There's not really a more efficient way to do it.
body, err := ioutil.ReadAll(r.Body)
If you are unconditionally transferring bytes from an io.Reader to an io.Writer, you can just use io.Copy:
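For the handler in the question, that could look like the sketch below; DbManager and Topic are reduced to minimal hypothetical stand-ins, and the body bytes go to Publish untouched, so the payload is read exactly once:
package collect

import (
    "bytes"
    "io"
    "net/http"
)

// DbManager and Topic are stand-ins matching how the question's
// handler uses them.
type DbManager interface {
    Publish(msg []byte, topic string) bool
}

var Topic = new(string)

// Collect forwards the raw request body without decoding it.
func Collect(d DbManager) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        var buf bytes.Buffer
        if _, err := io.Copy(&buf, r.Body); err != nil {
            http.Error(w, "read failed", http.StatusBadRequest)
            return
        }
        if !d.Publish(buf.Bytes(), *Topic) {
            http.Error(w, "publish failed", http.StatusBadGateway)
            return
        }
        w.Header().Set("Content-Type", "application/json; charset=utf-8")
        w.Write([]byte(`{"message":"Ok"}`))
    }
}
Note this drops the MachineType check from the original handler; any validation at all means decoding at least part of the payload.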

Why does json.Encoder add an extra line?

json.Encoder seems to behave slightly differently than json.Marshal. Specifically, it adds a newline at the end of the encoded value. Any idea why that is? It looks like a bug to me.
package main

import "fmt"
import "encoding/json"
import "bytes"

func main() {
    var v string
    v = "hello"
    buf := bytes.NewBuffer(nil)
    json.NewEncoder(buf).Encode(v)
    b, _ := json.Marshal(&v)
    fmt.Printf("%q, %q", buf.Bytes(), b)
}
This outputs
"\"hello\"\n", "\"hello\""
Try it in the Playground
Because they explicitly added a newline character in Encoder.Encode. Here's the source code of that func; the comment above it (which is its documentation) states that the encoding is followed by a newline:
https://golang.org/src/encoding/json/stream.go?s=4272:4319
// Encode writes the JSON encoding of v to the stream,
// followed by a newline character.
//
// See the documentation for Marshal for details about the
// conversion of Go values to JSON.
func (enc *Encoder) Encode(v interface{}) error {
    if enc.err != nil {
        return enc.err
    }
    e := newEncodeState()
    err := e.marshal(v)
    if err != nil {
        return err
    }
    // Terminate each value with a newline.
    // This makes the output look a little nicer
    // when debugging, and some kind of space
    // is required if the encoded value was a number,
    // so that the reader knows there aren't more
    // digits coming.
    e.WriteByte('\n')
    if _, err = enc.w.Write(e.Bytes()); err != nil {
        enc.err = err
    }
    encodeStatePool.Put(e)
    return err
}
Now, why did the Go developers do it, beyond making "the output look a little nicer"? One answer:
Streaming
The Go json Encoder is optimized for streaming (e.g. MB/GB/PB of JSON data). When streaming, you typically need a way to delimit where one value ends and the next begins. In the case of Encoder.Encode(), that delimiter is a \n newline character. Sure, you can certainly write to a buffer, but you can also write to an io.Writer, which streams each encoded value as it is produced.
This is opposed to json.Marshal, which is generally discouraged when your input comes from an untrusted source of unknown size (e.g. an ajax POST to your web service - what if someone posts a 100MB JSON file?). Also, json.Marshal produces one final, complete piece of JSON - you wouldn't concatenate a few hundred Marshal results together. You'd use Encoder.Encode() for that: build a large set and write it to a buffer, stream, file, io.Writer, etc.
Whenever I'm in doubt about whether something is a bug, I look up the source - that's one of the advantages of Go: its source and compiler are just pure Go. Within [n]vim I use \gb to open the source definition in a browser with my .vimrc settings.
You can erase the newline by seeking backward in the stream:
f, _ := os.OpenFile(fname, ...)
encoder := json.NewEncoder(f)
encoder.Encode(v)
// Seek back over the trailing '\n' before appending more data.
f.Seek(-1, io.SeekCurrent)
f.WriteString("other data ...")
They should let the user control this strange behavior, for example via:
a build option to disable it
Encoder.SetEOF(eof string)
Encoder.SetIndent(prefix, indent, eof string)
The Encoder writes a stream of documents. The extra whitespace terminates a JSON document in the stream.
A terminator is required for stream readers. Consider a stream containing these JSON documents: 1, 2, 3. Without the extra whitespace, the data on the wire is the sequence of bytes 123. This is a single JSON document with the number 123, not three documents.
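To make that concrete, here's a small self-contained demo: three values encoded to one stream come back as three separate documents precisely because of the newline terminators.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
)

func main() {
    var buf bytes.Buffer
    enc := json.NewEncoder(&buf)
    for _, n := range []int{1, 2, 3} {
        enc.Encode(n) // buf now holds "1\n2\n3\n"
    }
    dec := json.NewDecoder(&buf)
    for {
        var n int
        if err := dec.Decode(&n); err == io.EOF {
            break
        } else if err != nil {
            panic(err)
        }
        fmt.Println(n) // prints 1, 2, 3 as separate documents
    }
}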

Golang pass JSON object to a function

This is regarding the Selenium WebDriver, but I think that's not particularly important here.
I can set the browser name:
caps := selenium.Capabilities{"browserName": "firefox"}
wd, _ := selenium.NewRemote(caps, "")
But for "proxy", i.e.:
caps := selenium.Capabilities{"proxy": "http://1.2.3.4:999"}
wd, _ := selenium.NewRemote(caps, "")
I have to pass a JSON proxy object, which I have absolutely no idea how to create... I searched here and there, but still could not figure it out... Is it some kind of struct? Or a map... or what... :-(
As I've said in the comment, you can use the form
selenium.Capabilities{
    "proxy": map[string]interface{}{
        "httpProxy": "http://1.2.3.4:999",
        // etc.
    },
}
Unstructured JSON is usually (un)marshalled through map[string]interface{}, and the type selenium.Capabilities is in fact just a map[string]interface{}.
See also: JSON and Go.
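For illustration, here's a short sketch of what such a nested map marshals to; a plain map[string]interface{} stands in for selenium.Capabilities, and the keys are examples rather than a full capabilities spec:
package main

import (
    "encoding/json"
    "fmt"
)

func main() {
    caps := map[string]interface{}{
        "browserName": "firefox",
        "proxy": map[string]interface{}{
            "httpProxy": "http://1.2.3.4:999",
        },
    }
    b, _ := json.Marshal(caps)
    fmt.Println(string(b))
    // Output: {"browserName":"firefox","proxy":{"httpProxy":"http://1.2.3.4:999"}}
}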

Sending a MongoDB query to a different system: converting to JSON and then decoding into BSON? How to do it in Go?

I need to transfer a MongoDB query to a different system. For this reason I would like to use the MongoDB Extended JSON. I need this to be done mostly because I use date comparisons in my queries.
So, the kernel of the problem is that I need to transfer a MongoDB query that has been generated in a node.js back-end to another back-end written in Go language.
Intuitively, the most obvious format for sending this query via REST is JSON. But MongoDB queries are not exactly JSON; they are BSON, which contains special constructs for dates.
So, the idea is to convert the queries into JSON, using MongoDB Extended JSON as the representation of the special constructs. After some tests it's clear that such queries do not work as-is: both the MongoDB shell and queries sent via node.js need the special ISODate or new Date constructs.
Finally, the actual question: are there functions to encode/decode from JSON to BSON, taking into account MongoDB Extended JSON, both in JavaScript (node.js) and Go language?
Updates
Node.js encoding package
Apparently there is a node.js package that parses and stringifies BSON/JSON.
So, half of my problem is resolved. I wonder if there is something like this in Go.
Sample query
For example, the following query is in normal BSON:
{ Tmin: { $gt: ISODate("2006-01-01T23:00:00.000Z") } }
Translated into MongoDB Extended JSON, it becomes:
{ "Tmin": { "$gt" : { "$date" : 1136156400000 }}}
After some research I found the mejson library; however, it's for marshaling only, so I decided to write an unmarshaller.
Behold ejson (I wrote it). Right now it's a very simple ejson -> bson converter; there's no bson -> ejson yet, but you can use mejson for that.
An example:
const j = `{"_id":{"$oid":"53c2ab5e4291b17b666d742a"},"last_seen_at":{"$date":1405266782008},"display_name":{"$undefined":true},
"ref":{"$ref":"col2", "$id":"53c2ab5e4291b17b666d742b"}}`

type TestS struct {
    Id          bson.ObjectId `bson:"_id"`
    LastSeenAt  *time.Time    `bson:"last_seen_at"`
    DisplayName *string       `bson:"display_name,omitempty"`
    Ref         mgo.DBRef     `bson:"ref"`
}

func main() {
    var ts TestS
    if err := ejson.Unmarshal([]byte(j), &ts); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", ts)

    // Or, to convert the ejson to a bson.M:
    var m map[string]interface{}
    if err := json.Unmarshal([]byte(j), &m); err != nil {
        panic(err)
    }
    if err := ejson.Normalize(m); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", m)
}
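If you only need the $date construct and want to avoid a dependency, a hand-rolled walk over the decoded query is enough. A minimal sketch - normalize is a hypothetical helper, not part of ejson's API, and arrays are omitted for brevity:
package main

import (
    "encoding/json"
    "fmt"
    "time"
)

// normalize recursively replaces {"$date": <ms>} with a time.Time.
func normalize(v interface{}) interface{} {
    m, ok := v.(map[string]interface{})
    if !ok {
        return v
    }
    if ms, ok := m["$date"].(float64); ok && len(m) == 1 {
        return time.UnixMilli(int64(ms)).UTC()
    }
    for k, val := range m {
        m[k] = normalize(val)
    }
    return m
}

func main() {
    var q map[string]interface{}
    if err := json.Unmarshal([]byte(`{"Tmin":{"$gt":{"$date":1136156400000}}}`), &q); err != nil {
        panic(err)
    }
    fmt.Printf("%v\n", normalize(q))
    // map[Tmin:map[$gt:2006-01-01 23:00:00 +0000 UTC]]
}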