Why does json.Encoder add an extra line? - json

json.Encoder seems to behave slightly different than json.Marshal. Specifically it adds a new line at the end of the encoded value. Any idea why is that? It looks like a bug to me.
package main
import "fmt"
import "encoding/json"
import "bytes"
func main() {
var v string
v = "hello"
buf := bytes.NewBuffer(nil)
json.NewEncoder(buf).Encode(v)
b, _ := json.Marshal(&v)
fmt.Printf("%q, %q", buf.Bytes(), b)
}
This outputs
"\"hello\"\n", "\"hello\""
Try it in the Playground

Because they explicitly added a new line character when using Encoder.Encode. Here's the source code to that func, and it actually states it adds a newline character in the documentation (see comment, which is the documentation):
https://golang.org/src/encoding/json/stream.go?s=4272:4319
// Encode writes the JSON encoding of v to the stream,
// followed by a newline character.
//
// See the documentation for Marshal for details about the
// conversion of Go values to JSON.
func (enc *Encoder) Encode(v interface{}) error {
if enc.err != nil {
return enc.err
}
e := newEncodeState()
err := e.marshal(v)
if err != nil {
return err
}
// Terminate each value with a newline.
// This makes the output look a little nicer
// when debugging, and some kind of space
// is required if the encoded value was a number,
// so that the reader knows there aren't more
// digits coming.
e.WriteByte('\n')
if _, err = enc.w.Write(e.Bytes()); err != nil {
enc.err = err
}
encodeStatePool.Put(e)
return err
}
Now, why did the Go developers do it other than "makes the output look a little nice"? One answer:
Streaming
The go json Encoder is optimized for streaming (e.g. MB/GB/PB of json data). It is typical that when streaming you need a way to deliminate when your stream has completed. In the case of Encoder.Encode(), that is a \n newline character. Sure, you can certainly write to a buffer. But you can also write to an io.Writer which would stream the block of v.
This is opposed to the use of json.Marshal which is generally discouraged if your input is from an untrusted (and unknown limited) source (e.g. an ajax POST method to your web service - what if someone posts a 100MB json file?). And, json.Marshal would be a final complete set of json - e.g. you wouldn't expect to concatenate a few 100 Marshal entries together. You'd use Encoder.Encode() for that to build a large set and write to the buffer, stream, file, io.Writer, etc.
Whenever in doubt if it's a bug, I always lookup the source - that's one of the advantages to Go, it's source and compiler is just pure Go. Within [n]vim I use \gb to open the source definition in a browser with my .vimrc settings.

You can erease the newline by backward stream:
f, _ := os.OpenFile(fname, ...)
encoder := json.NewEncoder(f)
encoder.Encode(v)
f.Seek(-1, 1)
f.WriteString("other data ...")
They should let user control this strange behavior:
a build option to disable it
Encoder.SetEOF(eof string)
Encoder.SetIndent(prefix, indent, eof string)

The Encoder writes a stream of documents. The extra whitespace terminates a JSON document in the stream.
A terminator is required for stream readers. Consider a stream containing these JSON documents: 1, 2, 3. Without the extra whitespace, the data on the wire is the sequence of bytes 123. This is a single JSON document with the number 123, not three documents.

Related

How we can read a JSON file with Go Programming Language?

I'm working on a translation project on my Angular app. I already create all the different keys for that. I try now to use Go Programming Language to add some functionalities in my translation, to work quickly after.
I try to code a function in Go Programming Language in order to read an input user on the command line. I need to read this input file in order to know if there is missing key inside. This input user must be a JSON file. I have a problem with this function, is blocked at functions.Check(err), in order to debug my function I displayed the different variable with fmt.Printf(variable to display).
I call this function readInput() in my main function.
The readInput() function is the following :
// this function is used to read the user's input on the command line
func readInput() string {
// we create a reader
reader := bufio.NewReader(os.Stdin)
// we read the user's input
answer, err := reader.ReadString('\n')
// we check if any errors have occured while reading
functions.Check(err)
// we trim the "\n" from the answer to only keep the string input by the user
answer = strings.Trim(answer, "\n")
return answer
}
In my main function I call readInput() for a specific command I created. This command line is usefull to update a JSON file and add a missing key automatically.
My func main is :
func main() {
if os.Args[1] == "update-json-from-json" {
fmt.Printf("please enter the name of the json file that will be used to
update the json file:")
jsonFile := readInput()
fmt.Printf("please enter the ISO code of the locale for which you want to update the json file: ")
// we read the user's input
locale := readInput()
// we launch the script
scripts.AddMissingKeysToJsonFromJson(jsonFile, locale)
}
I can give you the command line I use for this code go run mis-t.go update-json-from-json
Do you what I'm missing in my code please ?
Presuming that the file contains dynamic and unknown keys and values, and you cannot model them in your application. Then you can do something like:
func main() {
if os.Args[1] == "update-json-from-json" {
...
jsonFile := readInput()
var jsonKeys interface{}
err := json.Unmarshal(jsonFile, &jsonKeys)
functions.Check(err)
...
}
}
to load the contents into the empty interface, and then use the go reflection library (https://golang.org/pkg/reflect/) to iterate over the fields, find their names and values and update them according to your needs.
The alternative is to Unmarshal into a map[string]string, but that won't cope very well with nested JSON, whereas this might (but I haven't tested it).

Load json file bundled within the executable [duplicate]

I'm working on a small web application in Go that's meant to be used as a tool on a developer's machine to help debug their applications/web services. The interface to the program is a web page that includes not only the HTML but some JavaScript (for functionality), images, and CSS (for styling). I'm planning on open-sourcing this application, so users should be able to run a Makefile, and all the resources will go where they need to go. However, I'd also like to be able to simply distribute an executable with as few files/dependencies as possible. Is there a good way to bundle the HTML/CSS/JS with the executable, so users only have to download and worry about one file?
Right now, in my app, serving a static file looks a little like this:
// called via http.ListenAndServe
func switchboard(w http.ResponseWriter, r *http.Request) {
// snipped dynamic routing...
// look for static resource
uri := r.URL.RequestURI()
if fp, err := os.Open("static" + uri); err == nil {
defer fp.Close()
staticHandler(w, r, fp)
return
}
// snipped blackhole route
}
So it's pretty simple: if the requested file exists in my static directory, invoke the handler, which simply opens the file and tries to set a good Content-Type before serving. My thought was that there's no reason this needs to be based on the real filesystem: if there were compiled resources, I could simply index them by request URI and serve them as such.
Let me know if there's not a good way to do this or I'm barking up the wrong tree by trying to do this. I just figured the end-user would appreciate as few files as possible to manage.
If there are more appropriate tags than go, please feel free to add them or let me know.
Starting with Go 1.16 the go tool has support for embedding static files directly in the executable binary.
You have to import the embed package, and use the //go:embed directive to mark what files you want to embed and into which variable you want to store them.
3 ways to embed a hello.txt file into the executable:
import "embed"
//go:embed hello.txt
var s string
print(s)
//go:embed hello.txt
var b []byte
print(string(b))
//go:embed hello.txt
var f embed.FS
data, _ := f.ReadFile("hello.txt")
print(string(data))
Using the embed.FS type for the variable you can even include multiple files into a variable that will provide a simple file-system interface:
// content holds our static web server content.
//go:embed image/* template/*
//go:embed html/index.html
var content embed.FS
The net/http has support to serve files from a value of embed.FS using http.FS() like this:
http.Handle("/static/", http.StripPrefix("/static/", http.FileServer(http.FS(content))))
The template packages can also parse templates using text/template.ParseFS(), html/template.ParseFS() functions and text/template.Template.ParseFS(), html/template.Template.ParseFS() methods:
template.ParseFS(content, "*.tmpl")
The following of the answer lists your old options (prior to Go 1.16).
Embedding Text Files
If we're talking about text files, they can easily be embedded in the source code itself. Just use the back quotes to declare the string literal like this:
const html = `
<html>
<body>Example embedded HTML content.</body>
</html>
`
// Sending it:
w.Write([]byte(html)) // w is an io.Writer
Optimization tip:
Since most of the times you will only need to write the resource to an io.Writer, you can also store the result of a []byte conversion:
var html = []byte(`
<html><body>Example...</body></html>
`)
// Sending it:
w.Write(html) // w is an io.Writer
Only thing you have to be careful about is that raw string literals cannot contain the back quote character (`). Raw string literals cannot contain sequences (unlike the interpreted string literals), so if the text you want to embed does contain back quotes, you have to break the raw string literal and concatenate back quotes as interpreted string literals, like in this example:
var html = `<p>This is a back quote followed by a dot: ` + "`" + `.</p>`
Performance is not affected, as these concatenations will be executed by the compiler.
Embedding Binary Files
Storing as a byte slice
For binary files (e.g. images) most compact (regarding the resulting native binary) and most efficient would be to have the content of the file as a []byte in your source code. This can be generated by 3rd party toos/libraries like go-bindata.
If you don't want to use a 3rd party library for this, here's a simple code snippet that reads a binary file, and outputs Go source code that declares a variable of type []byte that will be initialized with the exact content of the file:
imgdata, err := ioutil.ReadFile("someimage.png")
if err != nil {
panic(err)
}
fmt.Print("var imgdata = []byte{")
for i, v := range imgdata {
if i > 0 {
fmt.Print(", ")
}
fmt.Print(v)
}
fmt.Println("}")
Example output if the file would contain bytes from 0 to 16 (try it on the Go Playground):
var imgdata = []byte{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
Storing as base64 string
If the file is not "too large" (most images/icons qualify), there are other viable options too. You can convert the content of the file to a Base64 string and store that in your source code. On application startup (func init()) or when needed, you can decode it to the original []byte content. Go has nice support for Base64 encoding in the encoding/base64 package.
Converting a (binary) file to base64 string is as simple as:
data, err := ioutil.ReadFile("someimage.png")
if err != nil {
panic(err)
}
fmt.Println(base64.StdEncoding.EncodeToString(data))
Store the result base64 string in your source code, e.g. as a const.
Decoding it is just one function call:
const imgBase64 = "<insert base64 string here>"
data, err := base64.StdEncoding.DecodeString(imgBase64) // data is of type []byte
Storing as quoted string
More efficient than storing as base64, but may be longer in source code is storing the quoted string literal of the binary data. We can obtain the quoted form of any string using the strconv.Quote() function:
data, err := ioutil.ReadFile("someimage.png")
if err != nil {
panic(err)
}
fmt.Println(strconv.Quote(string(data))
For binary data containing values from 0 up to 64 this is how the output would look like (try it on the Go Playground):
"\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?"
(Note that strconv.Quote() appends and prepends a quotation mark to it.)
You can directly use this quoted string in your source code, for example:
const imgdata = "\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?"
It is ready to use, no need to decode it; the unquoting is done by the Go compiler, at compile time.
You may also store it as a byte slice should you need it like that:
var imgdata = []byte("\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?")
The go-bindata package looks like it might be what you're interested in.
https://github.com/go-bindata/go-bindata
It will allow you to convert any static file into a function call that can be embedded in your code and will return a byte slice of the file content when called.
Bundle React application
For example, you have a build output from react like the following:
build/favicon.ico
build/index.html
build/asset-manifest.json
build/static/css/**
build/static/js/**
build/manifest.json
When you use go:embed like this, it will serve the contents as http://localhost:port/build/index.html which is not what we want (unexpected /build).
//go:embed build/*
var static embed.FS
// ...
http.Handle("/", http.FileServer(http.FS(static)))
In fact, we will need to take one more step to make it works as expected by using fs.Sub:
package main
import (
"embed"
"io/fs"
"log"
"net/http"
)
//go:embed build/*
var static embed.FS
func main() {
contentStatic, _ := fs.Sub(static, "build")
http.Handle("/", http.FileServer(http.FS(contentStatic)))
log.Fatal(http.ListenAndServe("localhost:8080", nil))
}
Now, http://localhost:8080 should serve your web application as expected.
Credit to Amit Mittal.
Note: go:embed requires go 1.16 or higher.
also there is some exotic way - I use maven plugin to build GoLang projects and it allows to use JCP preprocessor to embed binary blocks and text files into sources. In the case code just look like line below (and some example can be found here)
var imageArray = []uint8{/*$binfile("./image.png","uint8[]")$*/}
As a popular alternative to go-bindata mentioned in another answer, mjibson/esc also embeds arbitrary files, but handles directory trees particularly conveniently.

Wrapping json member fields to object

My objective is to add fields to json on user request.
Everything is great, but when displaying the fields with
fmt.Printf("%s: %s\n", content.Date, content.Description)
an error occurs:
invalid character '{' after top-level value
And that is because after adding new fields the file looks like this:
{"Date":"2017-03-20 10:46:48","Description":"new"}
{"Date":"2017-03-20 10:46:51","Description":"new .go"}
The biggest problem is with the writting to file
reminder := &Name{dateString[:19], text} //text - input string
newReminder, _ := json.Marshal(&reminder)
I dont really know how to do this properly
My question is how should I wrap all member fields into one object?
And what is the best way to iterate through member fields?
The code is available here: https://play.golang.org/p/NunV_B6sud
You should store the reminders into an array inside the json file, as mentioned by #Gerben Jacobs, and then, every time you want to add a new reminder to the array you need to read the full contents of rem.json, append the new reminder in Go, truncate the file, and write the new slice into the file. Here's a quick implentation https://play.golang.org/p/UKR91maQF2.
If you have lots of reminders and the process of reading, decoding, encoding, and writing the whole content becomes a pain you could open the file, implement a way to truncate only the last ] from the file contents, and then write only , + new reminder + ].
So after some research, people in the go-nuts group helped me and suggested me to use a streaming json parser that parses items individually.
So I needed to change my reminder listing function:
func listReminders() error {
f, err := os.Open("rem.json")
if err != nil {
return err
}
dec := json.NewDecoder(f)
for {
var content Name
switch dec.Decode(&content) {
case nil:
fmt.Printf("%#v\n", content)
case io.EOF:
return nil
default:
return err
}
}
}
Now everything works the way I wanted.

How do I substitute bytes in a stream using Go standard library?

I have a io.Reader which I get from http.Request.Body that reads a JSON byte slice from a server.
I would like to stream this to json.NewDecoder. However I would also like to intercept the JSON before it hits json.NewDecoder and substitute certain parts of it. For example, the JSON string contains empty hashes "{}" which I would like to remove due to a bug in the server's JSON output.
I am currently achieving my goal using json.Unmarshal but not using the JSON streaming parser:
data, _ := ioutil.ReadAll(r.Body)
data = bytes.Replace(data, []byte("{}"), "", -1)
json.Unmarshal(data, [my struct])
How can I achieve the same thing as above but using json.NewDecoder so I can save the many times the above code has to parse through r.Body's data? Here's some code using a pseudo function ReplaceStream(r io.Reader, old, new []byte):
reader := ReplaceStream(r.Body, []byte("{}"), "")
dec := json.NewDecoder(reader)
dec.Decode([my struct])
I know ReplaceStream might be fairly trivial to make, but is there anything in the standard library to do this that I am unaware of?
My advice is to just treat that kind of message as a special case and avoid the extra parsing / substituting for all the other requests
data, _ := ioutil.ReadAll(r.Body)
// FIXME: overcome bug #12312 of json server
if data == `{"list": [{}]}` {
return []
}
// Normal datastruct ..

Remove invalid UTF-8 characters from a string

I get this on json.Marshal of a list of strings:
json: invalid UTF-8 in string: "...ole\xc5\"
The reason is obvious, but how can I delete/replace such strings in Go? I've been reading docst on unicode and unicode/utf8 packages and there seems no obvious/quick way to do it.
In Python for example you have methods for it where the invalid characters can be deleted, replaced by a specified character or strict setting which raises exception on invalid chars. How can I do equivalent thing in Go?
UPDATE: I meant the reason for getting an exception (panic?) - illegal char in what json.Marshal expects to be valid UTF-8 string.
(how the illegal byte sequence got into that string is not important, the usual way - bugs, file corruption, other programs that do not conform to unicode, etc)
In Go 1.13+, you can do this:
strings.ToValidUTF8("a\xc5z", "")
In Go 1.11+, it's also very easy to do the same using the Map function and utf8.RuneError like this:
fixUtf := func(r rune) rune {
if r == utf8.RuneError {
return -1
}
return r
}
fmt.Println(strings.Map(fixUtf, "a\xc5z"))
fmt.Println(strings.Map(fixUtf, "posic�o"))
Output:
az
posico
Playground: Here.
For example,
package main
import (
"fmt"
"unicode/utf8"
)
func main() {
s := "a\xc5z"
fmt.Printf("%q\n", s)
if !utf8.ValidString(s) {
v := make([]rune, 0, len(s))
for i, r := range s {
if r == utf8.RuneError {
_, size := utf8.DecodeRuneInString(s[i:])
if size == 1 {
continue
}
}
v = append(v, r)
}
s = string(v)
}
fmt.Printf("%q\n", s)
}
Output:
"a\xc5z"
"az"
Unicode Standard
FAQ - UTF-8, UTF-16, UTF-32 & BOM
Q: Are there any byte sequences that are not generated by a UTF? How
should I interpret them?
A: None of the UTFs can generate every arbitrary byte sequence. For
example, in UTF-8 every byte of the form 110xxxxx2 must be followed
with a byte of the form 10xxxxxx2. A sequence such as <110xxxxx2
0xxxxxxx2> is illegal, and must never be generated. When faced with
this illegal byte sequence while transforming or interpreting, a UTF-8
conformant process must treat the first byte 110xxxxx2 as an illegal
termination error: for example, either signaling an error, filtering
the byte out, or representing the byte with a marker such as FFFD
(REPLACEMENT CHARACTER). In the latter two cases, it will continue
processing at the second byte 0xxxxxxx2.
A conformant process must not interpret illegal or ill-formed byte
sequences as characters, however, it may take error recovery actions.
No conformant process may use irregular byte sequences to encode
out-of-band information.
Another way to do this, according to this answer, could be
s = string([]rune(s))
Example:
package main
import (
"fmt"
"unicode/utf8"
)
func main() {
s := "...ole\xc5"
fmt.Println(s, utf8.Valid([]byte(s)))
// Output: ...ole� false
s = string([]rune(s))
fmt.Println(s, utf8.Valid([]byte(s)))
// Output: ...ole� true
}
Even though the result doesn't look "pretty", it still nevertheless converts the string into a valid UTF-8 encoding.