Load json file bundled within the executable [duplicate]

I'm working on a small web application in Go that's meant to be used as a tool on a developer's machine to help debug their applications/web services. The interface to the program is a web page that includes not only the HTML but some JavaScript (for functionality), images, and CSS (for styling). I'm planning on open-sourcing this application, so users should be able to run a Makefile, and all the resources will go where they need to go. However, I'd also like to be able to simply distribute an executable with as few files/dependencies as possible. Is there a good way to bundle the HTML/CSS/JS with the executable, so users only have to download and worry about one file?
Right now, in my app, serving a static file looks a little like this:
// called via http.ListenAndServe
func switchboard(w http.ResponseWriter, r *http.Request) {
    // snipped dynamic routing...

    // look for static resource
    uri := r.URL.RequestURI()
    if fp, err := os.Open("static" + uri); err == nil {
        defer fp.Close()
        staticHandler(w, r, fp)
        return
    }

    // snipped blackhole route
}
So it's pretty simple: if the requested file exists in my static directory, invoke the handler, which simply opens the file and tries to set a good Content-Type before serving. My thought was that there's no reason this needs to be based on the real filesystem: if there were compiled resources, I could simply index them by request URI and serve them as such.
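A minimal sketch of that in-memory idea (the staticFiles map, its contents, and the use of http.DetectContentType are illustrative assumptions, not the real app's code):

var staticFiles = map[string][]byte{
    "/index.html": []byte("<html><body>...</body></html>"),
    // further resources indexed by request URI
}

func switchboard(w http.ResponseWriter, r *http.Request) {
    // snipped dynamic routing...

    // look for a compiled-in static resource
    if data, ok := staticFiles[r.URL.RequestURI()]; ok {
        // content sniffing is approximate; a real app might map extensions to types
        w.Header().Set("Content-Type", http.DetectContentType(data))
        w.Write(data)
        return
    }

    // snipped blackhole route
}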
Let me know if there's not a good way to do this or I'm barking up the wrong tree by trying to do this. I just figured the end-user would appreciate as few files as possible to manage.
If there are more appropriate tags than go, please feel free to add them or let me know.

Starting with Go 1.16, the go tool supports embedding static files directly in the executable binary.
You have to import the embed package and use the //go:embed directive to mark which files you want to embed and in which variable you want to store them.
3 ways to embed a hello.txt file into the executable:

import "embed"

// 1. Embed as a string:
//go:embed hello.txt
var s string
print(s)

// 2. Embed as a []byte:
//go:embed hello.txt
var b []byte
print(string(b))

// 3. Embed as an embed.FS:
//go:embed hello.txt
var f embed.FS
data, _ := f.ReadFile("hello.txt")
print(string(data))
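For a fuller picture, the third form could be wrapped in a complete program like this sketch (it assumes a hello.txt sitting next to the source file):

package main

import (
    "embed"
    "fmt"
)

//go:embed hello.txt
var f embed.FS

func main() {
    // the file is compiled into the binary, so nothing needs to ship alongside it
    data, err := f.ReadFile("hello.txt")
    if err != nil {
        panic(err)
    }
    fmt.Print(string(data))
}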
Using the embed.FS type for the variable, you can even include multiple files in a variable that provides a simple file-system interface:
// content holds our static web server content.
//go:embed image/* template/*
//go:embed html/index.html
var content embed.FS
The net/http package has support for serving files from a value of embed.FS using http.FS(), like this:
http.Handle("/static/", http.StripPrefix("/static/", http.FileServer(http.FS(content))))
The template packages can also parse templates, using the text/template.ParseFS() and html/template.ParseFS() functions and the text/template.Template.ParseFS() and html/template.Template.ParseFS() methods:
template.ParseFS(content, "*.tmpl")
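Putting the pieces together, a self-contained server could look like the sketch below (the static/ and templates/ directory names and index.tmpl are assumptions). Since embedded paths keep their directory prefix, handling "/static/" without StripPrefix lines the URLs up with the embedded static/ directory:

package main

import (
    "embed"
    "html/template"
    "log"
    "net/http"
)

//go:embed static templates
var content embed.FS

// parse all embedded templates once, at startup
var tmpl = template.Must(template.ParseFS(content, "templates/*.tmpl"))

func main() {
    // a URL like /static/app.css maps directly to the embedded static/app.css
    http.Handle("/static/", http.FileServer(http.FS(content)))
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        tmpl.ExecuteTemplate(w, "index.tmpl", nil)
    })
    log.Fatal(http.ListenAndServe(":8080", nil))
}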
The rest of this answer lists your old options (prior to Go 1.16).
Embedding Text Files
If we're talking about text files, they can easily be embedded in the source code itself. Just use the back quotes to declare the string literal like this:
const html = `
<html>
<body>Example embedded HTML content.</body>
</html>
`
// Sending it:
w.Write([]byte(html)) // w is an io.Writer
Optimization tip:
Since most of the time you will only need to write the resource to an io.Writer, you can also store the result of a []byte conversion:
var html = []byte(`
<html><body>Example...</body></html>
`)
// Sending it:
w.Write(html) // w is an io.Writer
The only thing you have to be careful about is that raw string literals cannot contain the back quote character (`). Raw string literals cannot contain escape sequences (unlike interpreted string literals), so if the text you want to embed does contain back quotes, you have to break the raw string literal and concatenate the back quotes as interpreted string literals, as in this example:
var html = `<p>This is a back quote followed by a dot: ` + "`" + `.</p>`
Performance is not affected, as these concatenations are carried out by the compiler.
Embedding Binary Files
Storing as a byte slice
For binary files (e.g. images), the most compact (regarding the resulting native binary) and most efficient option is to have the content of the file as a []byte in your source code. This can be generated by 3rd-party tools/libraries like go-bindata.
If you don't want to use a 3rd-party library for this, here's a simple code snippet that reads a binary file and outputs Go source code declaring a variable of type []byte initialized with the exact content of the file:
imgdata, err := ioutil.ReadFile("someimage.png")
if err != nil {
    panic(err)
}

fmt.Print("var imgdata = []byte{")
for i, v := range imgdata {
    if i > 0 {
        fmt.Print(", ")
    }
    fmt.Print(v)
}
fmt.Println("}")
Example output if the file contained the bytes from 0 to 15 (try it on the Go Playground):
var imgdata = []byte{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
Storing as base64 string
If the file is not "too large" (most images/icons qualify), there are other viable options too. You can convert the content of the file to a Base64 string and store that in your source code. On application startup (func init()) or when needed, you can decode it to the original []byte content. Go has nice support for Base64 encoding in the encoding/base64 package.
Converting a (binary) file to base64 string is as simple as:
data, err := ioutil.ReadFile("someimage.png")
if err != nil {
    panic(err)
}
fmt.Println(base64.StdEncoding.EncodeToString(data))
Store the resulting base64 string in your source code, e.g. as a const.
Decoding it is just one function call:
const imgBase64 = "<insert base64 string here>"
data, err := base64.StdEncoding.DecodeString(imgBase64) // data is of type []byte
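For example, decoding once at startup might look like this sketch (imgData is a name chosen here; the snippet continues the const above):

var imgData []byte

func init() {
    var err error
    imgData, err = base64.StdEncoding.DecodeString(imgBase64)
    if err != nil {
        // the embedded string is fixed at build time, so failing fast is reasonable
        panic(err)
    }
}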
Storing as quoted string
More efficient than storing as base64, but possibly longer in the source code, is storing the quoted string literal of the binary data. We can obtain the quoted form of any string using the strconv.Quote() function:
data, err := ioutil.ReadFile("someimage.png")
if err != nil {
    panic(err)
}
fmt.Println(strconv.Quote(string(data)))
For binary data containing the byte values from 0 to 63, this is what the output looks like (try it on the Go Playground):
"\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?"
(Note that strconv.Quote() appends and prepends a quotation mark to it.)
You can directly use this quoted string in your source code, for example:
const imgdata = "\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?"
It is ready to use, no need to decode it; the unquoting is done by the Go compiler, at compile time.
You may also store it as a byte slice should you need it like that:
var imgdata = []byte("\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?")

The go-bindata package looks like it might be what you're interested in.
https://github.com/go-bindata/go-bindata
It allows you to convert any static file into a function call that can be embedded in your code; when called, the function returns a byte slice with the file's content.
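Sketching its use (based on go-bindata's documented default output, which generates an Asset function; details may vary by version):

// Asset is generated by go-bindata from the files you pass it;
// the name is the original file path.
data, err := Asset("static/index.html")
if err != nil {
    // an asset with that name was not found
}
w.Write(data) // w is an io.Writer, e.g. an http.ResponseWriter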

Bundle React application
For example, you have a build output from react like the following:
build/favicon.ico
build/index.html
build/asset-manifest.json
build/static/css/**
build/static/js/**
build/manifest.json
When you use go:embed like this, it will serve the contents at http://localhost:port/build/index.html, which is not what we want (the /build prefix is unexpected):
//go:embed build/*
var static embed.FS
// ...
http.Handle("/", http.FileServer(http.FS(static)))
In fact, we need one more step to make it work as expected, using fs.Sub:
package main

import (
    "embed"
    "io/fs"
    "log"
    "net/http"
)

//go:embed build/*
var static embed.FS

func main() {
    contentStatic, _ := fs.Sub(static, "build")
    http.Handle("/", http.FileServer(http.FS(contentStatic)))
    log.Fatal(http.ListenAndServe("localhost:8080", nil))
}
Now, http://localhost:8080 should serve your web application as expected.
Credit to Amit Mittal.
Note: go:embed requires Go 1.16 or higher.

There is also a more exotic way: I use a Maven plugin to build Go projects, and it allows using the JCP preprocessor to embed binary blocks and text files into sources. In that case the code just looks like the line below (and some examples can be found here):
var imageArray = []uint8{/*$binfile("./image.png","uint8[]")$*/}

As a popular alternative to go-bindata mentioned in another answer, mjibson/esc also embeds arbitrary files, but handles directory trees particularly conveniently.

Related

How can we read a JSON file with the Go Programming Language?

I'm working on a translation project in my Angular app. I have already created all the different keys for it. Now I'm trying to use Go to add some functionality to my translation workflow, so I can work more quickly afterwards.
I'm trying to code a function in Go that reads user input on the command line. I need to read this input file in order to know whether any keys are missing inside it. The user's input must be a JSON file. I have a problem with this function: it blocks at functions.Check(err). To debug my function, I displayed the different variables with fmt.Printf.
I call this function readInput() in my main function.
The readInput() function is the following :
// this function is used to read the user's input on the command line
func readInput() string {
    // we create a reader
    reader := bufio.NewReader(os.Stdin)
    // we read the user's input
    answer, err := reader.ReadString('\n')
    // we check if any errors have occurred while reading
    functions.Check(err)
    // we trim the "\n" from the answer to only keep the string input by the user
    answer = strings.Trim(answer, "\n")
    return answer
}
In my main function I call readInput() for a specific command I created. This command line is useful to update a JSON file and add missing keys automatically.
My func main is :
func main() {
    if os.Args[1] == "update-json-from-json" {
        fmt.Printf("please enter the name of the json file that will be used to update the json file: ")
        jsonFile := readInput()
        fmt.Printf("please enter the ISO code of the locale for which you want to update the json file: ")
        // we read the user's input
        locale := readInput()
        // we launch the script
        scripts.AddMissingKeysToJsonFromJson(jsonFile, locale)
    }
}
The command line I use to run this code is: go run mis-t.go update-json-from-json
Do you know what I'm missing in my code, please?
Presuming that the file contains dynamic and unknown keys and values that you cannot model in your application, you can do something like:
func main() {
    if os.Args[1] == "update-json-from-json" {
        ...
        jsonFile := readInput()
        // jsonFile only holds the file's name, so read its contents first
        data, err := ioutil.ReadFile(jsonFile)
        functions.Check(err)
        var jsonKeys interface{}
        err = json.Unmarshal(data, &jsonKeys)
        functions.Check(err)
        ...
    }
}
to load the contents into the empty interface, and then use the Go reflection library (https://golang.org/pkg/reflect/) to iterate over the fields, find their names and values, and update them according to your needs.
The alternative is to unmarshal into a map[string]string, but that won't cope very well with nested JSON, whereas this might (though I haven't tested it).
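To make that concrete without reaching for reflection, a recursive type switch over the default decoded types is often enough. A sketch (printKeys is a name invented here, not part of any library):

// printKeys walks a value decoded into interface{} and prints every key path.
// encoding/json decodes JSON objects into map[string]interface{} by default.
func printKeys(prefix string, v interface{}) {
    m, ok := v.(map[string]interface{})
    if !ok {
        return // a leaf value (string, float64, bool, nil) or an array
    }
    for k, val := range m {
        fmt.Println(prefix + k)
        printKeys(prefix+k+".", val)
    }
}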

How to read a non-UTF8 encoded csv file?

Using the csv crate and the latest Rust version (1.31.0), I want to read CSV files with ANSI (Windows-1252) encoding as easily as I can read UTF-8 ones.
Things I have tried (with no luck) after reading the whole file into a Vec<u8>:
CString
OsString
Indeed, in my company, we have a lot of CSV files, ANSI encoded.
Also, if possible, I would prefer not to load the entire file into a Vec<u8> but to read it line by line (CRLF endings), as many of the files are big (50 MB or more…).
In the file Cargo.toml, I have this dependency:
[dependencies]
csv = "1"
test.csv consists of the following content, saved with Windows-1252 encoding:
Café;au;lait
Café;au;lait
The code in main.rs file:
extern crate csv;

use std::error::Error;
use std::fs::File;
use std::io::BufReader;
use std::path::Path;
use std::process;

fn example() -> Result<(), Box<Error>> {
    let file_name = r"test.csv";
    let file_handle = File::open(Path::new(file_name))?;
    let reader = BufReader::new(file_handle);
    let mut rdr = csv::ReaderBuilder::new()
        .delimiter(b';')
        .from_reader(reader);
    // println!("ANSI");
    // for result in rdr.byte_records() {
    //     let record = result?;
    //     println!("{:?}", record);
    // }
    println!("UTF-8");
    for result in rdr.records() {
        let record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
The output is:
UTF-8
error running example: CSV parse error: record 0 (line 1, field: 0, byte: 0): invalid utf-8: invalid UTF-8 in field 0 near byte index 3
error: process didn't exit successfully: `target\debug\test-csv.exe` (exit code: 1)
When using rdr.byte_records() (uncommenting the relevant part of code), the output is:
ANSI
ByteRecord(["Caf\xe9", "au", "lait"])
I suspect this question is underspecified. In particular, it's not clear why your use of the ByteRecord API is insufficient. In the csv crate, byte records exist for exactly cases like this, where your CSV data isn't strictly UTF-8 but is in an alternative encoding, such as Windows-1252, that is ASCII compatible. (An ASCII-compatible encoding is an encoding in which ASCII is a subset. Windows-1252 and UTF-8 are both ASCII compatible. UTF-16 is not.) Your code sample above shows that you're using byte records, but doesn't explain why this is insufficient.
With that said, if your goal is to get your data into Rust's string data type (String/&str), then your only option is to transcode the contents of your CSV data from Windows-1252 to UTF-8. This is necessary because Rust's string data type uses UTF-8 for its in-memory representation. You cannot have a Rust String/&str that is Windows-1252 encoded because Windows-1252 is not a subset of UTF-8.
Other comments have recommended the use of the encoding crate. However, I would instead recommend the use of encoding_rs, if your use case aligns with the same use cases solved by the Encoding Standard, which is specifically geared towards the web. Fortunately, I believe such an alignment exists.
In order to satisfy your criteria for reading CSV data in a streaming fashion without first loading the entire contents into memory, you need to use a wrapper around the encoding_rs crate that implements streaming decoding for you. The encoding_rs_io crate provides this for you. (It's used inside of ripgrep to do fast streaming decoding before searching UTF-8.)
Here is an example program that puts all of the above together, using Rust 2018:
use std::fs::File;
use std::process;

use encoding_rs::WINDOWS_1252;
use encoding_rs_io::DecodeReaderBytesBuilder;

fn main() {
    if let Err(err) = try_main() {
        eprintln!("{}", err);
        process::exit(1);
    }
}

fn try_main() -> csv::Result<()> {
    let file = File::open("test.csv")?;
    let transcoded = DecodeReaderBytesBuilder::new()
        .encoding(Some(WINDOWS_1252))
        .build(file);
    let mut rdr = csv::ReaderBuilder::new()
        .delimiter(b';')
        .from_reader(transcoded);
    for result in rdr.records() {
        let r = result?;
        println!("{:?}", r);
    }
    Ok(())
}
with the Cargo.toml:
[package]
name = "so53826986"
version = "0.1.0"
edition = "2018"
[dependencies]
csv = "1"
encoding_rs = "0.8.13"
encoding_rs_io = "0.1.3"
And the output:
$ cargo run --release
Compiling so53826986 v0.1.0 (/tmp/so53826986)
Finished release [optimized] target(s) in 0.63s
Running `target/release/so53826986`
StringRecord(["Café", "au", "lait"])
In particular, if you swap out rdr.records() for rdr.byte_records(), then we can see more clearly what happened:
$ cargo run --release
Compiling so53826986 v0.1.0 (/tmp/so53826986)
Finished release [optimized] target(s) in 0.61s
Running `target/release/so53826986`
ByteRecord(["Caf\xc3\xa9", "au", "lait"])
Namely, your input contained Caf\xE9, but the byte record now contains Caf\xC3\xA9. This is a result of translating the Windows-1252 codepoint value of 233 (encoded as its literal byte, \xE9) to U+00E9 LATIN SMALL LETTER E WITH ACUTE, which is UTF-8 encoded as \xC3\xA9.

Why does json.Encoder add an extra line?

json.Encoder seems to behave slightly differently than json.Marshal. Specifically, it adds a newline at the end of the encoded value. Any idea why that is? It looks like a bug to me.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

func main() {
    var v string
    v = "hello"
    buf := bytes.NewBuffer(nil)
    json.NewEncoder(buf).Encode(v)
    b, _ := json.Marshal(&v)
    fmt.Printf("%q, %q", buf.Bytes(), b)
}
This outputs
"\"hello\"\n", "\"hello\""
Try it in the Playground
Because they explicitly add a newline character in Encoder.Encode. Here's the source code of that func, and it actually states in its documentation that it adds a newline character (see the comment, which is the documentation):
https://golang.org/src/encoding/json/stream.go?s=4272:4319
// Encode writes the JSON encoding of v to the stream,
// followed by a newline character.
//
// See the documentation for Marshal for details about the
// conversion of Go values to JSON.
func (enc *Encoder) Encode(v interface{}) error {
    if enc.err != nil {
        return enc.err
    }
    e := newEncodeState()
    err := e.marshal(v)
    if err != nil {
        return err
    }

    // Terminate each value with a newline.
    // This makes the output look a little nicer
    // when debugging, and some kind of space
    // is required if the encoded value was a number,
    // so that the reader knows there aren't more
    // digits coming.
    e.WriteByte('\n')

    if _, err = enc.w.Write(e.Bytes()); err != nil {
        enc.err = err
    }
    encodeStatePool.Put(e)
    return err
}
Now, why did the Go developers do it, other than that it "makes the output look a little nicer"? One answer:
Streaming
The Go json Encoder is optimized for streaming (e.g. MB/GB/PB of JSON data). It is typical when streaming that you need a way to delimit where one value in your stream ends. In the case of Encoder.Encode(), that is a \n newline character. Sure, you can certainly write to a buffer. But you can also write to an io.Writer, which would stream the block of v.
This is opposed to the use of json.Marshal, which is generally discouraged if your input comes from an untrusted source of unknown size (e.g. an ajax POST method to your web service: what if someone posts a 100MB json file?). And json.Marshal produces one final, complete piece of JSON; you wouldn't expect to concatenate a few hundred Marshal results together. You'd use Encoder.Encode() for that: build a large set and write to the buffer, stream, file, io.Writer, etc.
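To illustrate the streaming side, such newline-terminated values can be read back one at a time with json.Decoder (a sketch; Item stands in for your own type):

dec := json.NewDecoder(r) // r is any io.Reader carrying a stream of documents
for {
    var item Item
    if err := dec.Decode(&item); err == io.EOF {
        break // end of the stream
    } else if err != nil {
        log.Fatal(err)
    }
    // process item...
}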
Whenever I'm in doubt about whether something is a bug, I always look up the source; that's one of the advantages of Go: its source and compiler are just pure Go. Within [n]vim I use \gb to open the source definition in a browser with my .vimrc settings.
You can erase the newline by seeking backwards in the stream:
f, _ := os.OpenFile(fname, ...)
encoder := json.NewEncoder(f)
encoder.Encode(v)
// seek back one byte from the current position to overwrite the trailing '\n'
f.Seek(-1, 1)
f.WriteString("other data ...")
They should let the user control this strange behavior, for example via:
a build option to disable it
Encoder.SetEOF(eof string)
Encoder.SetIndent(prefix, indent, eof string)
The Encoder writes a stream of documents. The extra whitespace terminates a JSON document in the stream.
A terminator is required for stream readers. Consider a stream containing these JSON documents: 1, 2, 3. Without the extra whitespace, the data on the wire is the sequence of bytes 123. This is a single JSON document with the number 123, not three documents.

How do I do this? Send a struct to a func as an interface and get it back as a struct?

I'm new to golang development and have a question regarding something related to this question.
As a learning exercise, I'm trying to create a simple library to handle a JSON-based configuration file. Since the configuration file will be used by more than one app, it should be able to handle different parameters. So I have created a struct type Configuration that has the filename and a data interface. Each app will have a struct based on its configuration needs.
In the code below, I put it all together (lib and "main code"), and the "TestData struct" is the "app parameters".
If the file doesn't exist, the code sets default values and creates the file, and that part works. But when I read the file back, I try to decode the JSON and put it into the data interface, and that gives me an error I couldn't figure out how to solve. Can someone help with this?
[updated] I didn't post the targeted code before because I thought it would be easier to read it all as a single program. Below is the 'targeted code' for a better view of the issue.
As I will not be able to use the TestData struct inside the library, since it changes from program to program, the only way I found to handle this was using an interface. Is there a better way?
library config
package config

import (
    "encoding/json"
    "fmt"
    "os"
)

// Base configuration struct
type Configuration struct {
    Filename string
    Data     interface{}
}

func (c *Configuration) Create(cData *Configuration) bool {
    cFile, err := os.Open(cData.Filename)
    defer cFile.Close()
    if err == nil {
        fmt.Println("Error(1) trying to create a configuration file. File '", cData.Filename, "' may already exist...")
        return false
    }
    cFile, err = os.Create(cData.Filename)
    if err != nil {
        fmt.Println("Error(2) trying to create a configuration file. File '", cData.Filename, "' may already exist...")
        return false
    }
    buffer, _ := json.MarshalIndent(cData.Data, "", "")
    cFile.Write(buffer)
    return true
}

func (c *Configuration) Read(cData *Configuration) bool {
    cFile, err := os.Open(cData.Filename)
    defer cFile.Close()
    if err != nil {
        fmt.Println("Error(1) trying to read a configuration file. File '", cData.Filename, "' may not already exist...")
        return false
    }
    jConfig := json.NewDecoder(cFile)
    jerr := jConfig.Decode(&cData.Data)
    if jerr != nil {
        panic(jerr)
    }
    return true
}
program using library config
package main

import (
    "fmt"

    "./config"
)

// basic struct for configuration
type TestData struct {
    URL  string
    Port string
}

func main() {
    var Config config.Configuration
    Config.Filename = "config.json"
    if !Config.Read(&Config) {
        Config.Data = TestData{"http", "8080"}
        Config.Create(&Config)
    }
    fmt.Println(Config.Data)
    TestData1 := &TestData{}
    TestData1 = Config.Data.(*TestData) // error, why?
    fmt.Println(TestData1.URL)
}
NEW UPDATE:
I have made some changes after JimB's comment that I'm not clear about some concepts, and I tried to review them. Unfortunately, many things surely aren't clear to me yet. I believe I got the "big" understanding, but what messes my mind up is the "ins" and "outs" of values, formats and pointers, mainly when they pass through other libraries. I'm not yet able to follow their "full path".
Still, I believe my code has improved somewhat.
I think that I have corrected some points, but I still have some big questions:
I stopped sending "Configuration" as a parameter, as all the "data" is already there in the instance itself. Right?
Why do I have to use a reference in line 58 (Config.Data = &TestData{})?
Why do I have to use a pointer in line 64 (tmp := Config.Data.(*TestData))?
Why can I NOT use a reference in line 69 (Config.Data = tmp)?
Thanks
The reason you are running into an error is that you are trying to decode into an interface{} type. JSON objects are decoded by the encoding/json package into map[string]interface{} values by default. This causes the type assertion to fail, since the memory structure of a map[string]interface{} is much different from that of a struct.
The better way to do this is to make your TestData struct the expected data format for your Configuration struct:
// Base configuration struct
type Configuration struct {
    Filename string
    Data     *TestData
}
Then, when decoding the file data, the package will unmarshal the data into the fields that match most closely.
If you need more control over the data unmarshaling process, you can dictate which JSON fields get decoded into which struct members by using struct tags. You can read more about the json struct tags available here: https://golang.org/pkg/encoding/json/#Marshal
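For example, tags like the following (a sketch assuming lowercase key names in the JSON file) control the mapping explicitly:

type TestData struct {
    URL  string `json:"url"`  // maps the JSON key "url" to this field
    Port string `json:"port"` // maps the JSON key "port" to this field
}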
You are trying to assert that Config.Data is of type *TestData, but you're assigning it to TestData{"http", "8080"} above. You can take the address of a composite literal to create a pointer:
Config.Data = &TestData{"http", "8080"}
If your config already exists, your Read method will fill in the Data field with the default JSON data type, probably a map[string]interface{}. If you assign a pointer of the correct type to Data first, it will decode into the expected type.
Config.Data = &TestData{}
And since Data is an interface{}, you do not ever want to use a pointer to that value, so don't use the & operator when marshaling and unmarshaling.
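Putting the answer's points together, main could look like this sketch (variable names follow the question's code):

var Config config.Configuration
Config.Filename = "config.json"
Config.Data = &TestData{} // give Decode a concrete type to fill
if !Config.Read(&Config) {
    Config.Data = &TestData{"http", "8080"} // a pointer, so the assertion below matches
    Config.Create(&Config)
}
td := Config.Data.(*TestData) // now succeeds: Data really holds a *TestData
fmt.Println(td.URL)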

How do I substitute bytes in a stream using Go standard library?

I have an io.Reader, which I get from http.Request.Body, that reads a JSON byte slice from a server.
I would like to stream this to json.NewDecoder. However I would also like to intercept the JSON before it hits json.NewDecoder and substitute certain parts of it. For example, the JSON string contains empty hashes "{}" which I would like to remove due to a bug in the server's JSON output.
I am currently achieving my goal using json.Unmarshal but not using the JSON streaming parser:
data, _ := ioutil.ReadAll(r.Body)
data = bytes.Replace(data, []byte("{}"), []byte(""), -1)
json.Unmarshal(data, [my struct])
How can I achieve the same thing as above, but using json.NewDecoder, so I can avoid the extra passes the above code makes over r.Body's data? Here's some code using a pseudo function ReplaceStream(r io.Reader, old, new []byte):
reader := ReplaceStream(r.Body, []byte("{}"), []byte(""))
dec := json.NewDecoder(reader)
dec.Decode([my struct])
I know ReplaceStream might be fairly trivial to make, but is there anything in the standard library to do this that I am unaware of?
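As far as I know there is nothing ready-made in the standard library. Below is one possible sketch of ReplaceStream built on io.Pipe; note that it buffers the whole input, so it is not a true streaming replacement (handling matches that straddle read boundaries is what makes the real thing non-trivial):

func ReplaceStream(r io.Reader, old, new []byte) io.Reader {
    pr, pw := io.Pipe()
    go func() {
        // Buffering everything sidesteps matches that span chunk
        // boundaries, at the cost of memory proportional to the input.
        data, err := ioutil.ReadAll(r)
        if err != nil {
            pw.CloseWithError(err)
            return
        }
        pw.Write(bytes.Replace(data, old, new, -1))
        pw.Close()
    }()
    return pr
}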
My advice is to just treat that kind of message as a special case, and avoid the extra parsing / substituting for all the other requests:
data, _ := ioutil.ReadAll(r.Body)
// FIXME: overcome bug #12312 of the json server
if string(data) == `{"list": [{}]}` {
    return nil // the server meant an empty list
}
// normal datastruct ...