I am processing multiple .json files that I need to add to a single .zip archive using the package available here: https://github.com/larzconwell/bzip2.
I have looked at other possible solutions and questions related to io.Writer along with .Close() and .Flush().
The code I am using:
if processedCounter%*filesInPackage == 0 || filesLeftToProcess == 0 {
    // Create empty zip file with numbered filename.
    emptyZip, err := os.Create(filepath.Join(absolutePathOutputDirectory, "package_"+strconv.Itoa(packageCounter)+".zip"))
    if err != nil {
        panic(err)
    }
    // Get list of .json filenames to be packaged:
    listOfProcessedJSON := listFiles(absolutePathInterDirectory, ".json")
    bzipWriter, err := bzip2.NewWriterLevel(emptyZip, 1)
    if err != nil {
        panic(err)
    }
    defer bzipWriter.Close()
    // Add listed files to the archive
    for _, file := range listOfProcessedJSON {
        // Read byte array from json file:
        JSONContents, err := ioutil.ReadFile(file)
        if err != nil {
            fmt.Printf("Failed to open %s: %s", file, err)
        }
        // Write a single JSON to .zip:
        // Process hangs here!
        _, compressionError := bzipWriter.Write(JSONContents)
        if compressionError != nil {
            fmt.Printf("Failed to write %s to zip: %s", file, err)
            compressionErrorCounter++
        }
        err = bzipWriter.Close()
        if err != nil {
            fmt.Printf("Failed to Close bzipWriter")
        }
    }
    // Delete intermediate .json files
    dir, err := ioutil.ReadDir(absolutePathInterDirectory)
    for _, d := range dir {
        os.RemoveAll(filepath.Join([]string{"tmp", d.Name()}...))
    }
    packageCounter++
}
Using the debugger, it seems that my program hangs on the following line:
_, compressionError := bzipWriter.Write(JSONContents)
The package itself does not provide usage examples, so my knowledge is based on studying documentation, Stack Overflow questions, and various articles, e.g.:
https://www.golangprograms.com/go-program-to-compress-list-of-files-into-zip.html
Let me know if anyone knows a possible solution to this problem.
You are confusing the formats and what they do, likely because they contain the common substring "zip". zip is an archive format, intended to contain multiple files. bzip2 is a single-stream compressor, not an archive format, and can store only one file. gzip is the same as bzip2 in that regard. gzip, bzip2, xz, and other single-file compressors are all commonly used with tar in order to archive multiple files and their directory structure. tar collects the multiple files and structure into a single, uncompressed file, which is then compressed by the compressor of your choice.
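For example, if a .tar.bz2 output is acceptable, the standard library's archive/tar can be combined with the bzip2 writer package from the question. This is only a sketch: it reuses the question's listOfProcessedJSON variable and assumes the package's writer behaves as an ordinary io.WriteCloser.
// Sketch: archive multiple .json files as a single .tar.bz2.
out, err := os.Create("package.tar.bz2")
if err != nil {
    panic(err)
}
defer out.Close()

bzipWriter, err := bzip2.NewWriterLevel(out, 1)
if err != nil {
    panic(err)
}
defer bzipWriter.Close()

tarWriter := tar.NewWriter(bzipWriter) // tar collects the files, bzip2 compresses the stream
defer tarWriter.Close()

for _, file := range listOfProcessedJSON {
    contents, err := ioutil.ReadFile(file)
    if err != nil {
        panic(err)
    }
    hdr := &tar.Header{Name: filepath.Base(file), Mode: 0644, Size: int64(len(contents))}
    if err := tarWriter.WriteHeader(hdr); err != nil {
        panic(err)
    }
    if _, err := tarWriter.Write(contents); err != nil {
        panic(err)
    }
}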
The zip format works differently, where the archive format is on the outside, and each entry in the archive is individually compressed.
In any case, using a bzip2 package by itself will not be able to archive multiple files.
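If the output really has to be a .zip archive, the standard library's archive/zip already does both the archiving and the per-entry compression (Deflate, not bzip2). A minimal sketch, reusing the variable names and helpers from the question (absolutePathOutputDirectory, absolutePathInterDirectory, listFiles, packageCounter):
zipFile, err := os.Create(filepath.Join(absolutePathOutputDirectory, "package_"+strconv.Itoa(packageCounter)+".zip"))
if err != nil {
    panic(err)
}
defer zipFile.Close()

zipWriter := zip.NewWriter(zipFile)
defer zipWriter.Close() // writes the central directory; without it the .zip is unreadable

for _, file := range listFiles(absolutePathInterDirectory, ".json") {
    contents, err := ioutil.ReadFile(file)
    if err != nil {
        fmt.Printf("Failed to open %s: %s\n", file, err)
        continue
    }
    // Each Create call adds a new, individually compressed entry to the archive.
    entry, err := zipWriter.Create(filepath.Base(file))
    if err != nil {
        panic(err)
    }
    if _, err := entry.Write(contents); err != nil {
        fmt.Printf("Failed to write %s to zip: %s\n", file, err)
    }
}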
Related
I am trying to pull data on mails coming into an API from the email testing tool MailHog.
If I use a call to get a list of emails, e.g.
GET /api/v1/messages
I can load this data into a struct with no issues and print out values I need.
However, if I use a different endpoint that is essentially a stream of new emails coming in, I get different behavior. When I run my Go application I get no output whatsoever.
Do I need something like a while loop to constantly listen to the endpoint to get the output?
My end goal is to pull some information from emails as they come in and then pass them into a different function.
Here is my attempt to access the streaming endpoint:
https://github.com/mailhog/MailHog/blob/master/docs/APIv1.md
res, err := http.Get("http://localhost:8084/api/v1/events")
if err != nil {
    panic(err.Error())
}
body, err := ioutil.ReadAll(res.Body)
if err != nil {
    panic(err.Error())
}
var data Email
json.Unmarshal(body, &data)
fmt.Printf("Email: %v\n", data)
If I do a curl request to the MailHog service with the same endpoint, I do get output as mails come in. However, I can't figure out why I get no output via my Go app. The app stays running, I just don't get any output.
I am new to Go, so apologies if this is a really simple question.
From the ioutil.ReadAll documentation:
ReadAll reads from r until an error or EOF and returns the data it read.
When you use it to read the body of a regular endpoint, it works because the payload has an EOF: the server uses the Content-Length header to tell how many bytes the response body has, and once the client has read that many bytes, it knows it has read all of the body and can stop.
Your "streaming" endpoint doesn't use Content-Length, though, because the body has an unknown size: it is supposed to write events as they come, so you can't use ReadAll in this case. Usually you are supposed to read line by line, where each line represents an event. bufio.Scanner does exactly that:
res, err := http.Get("http://localhost:8084/api/v1/events")
if err != nil {
    panic(err.Error())
}
defer res.Body.Close()

scanner := bufio.NewScanner(res.Body)
for scanner.Scan() {
    event := scanner.Bytes()
    var data Email
    if err := json.Unmarshal(event, &data); err != nil {
        panic(err.Error())
    }
    fmt.Printf("Email: %v\n", data)
}
if err := scanner.Err(); err != nil {
    panic(err.Error())
}
curl can process the response as you expect because it detects that the endpoint streams data and reacts accordingly. It may be helpful to add the response curl gets to the question.
Consider a small Go application that reads a large JSON file (2GB+), unmarshals the JSON data into a struct, and POSTs the JSON data to a web service endpoint.
The web service receiving the payload changed its functionality, and now has a limit of 25MB per payload. What would be the best approach to overcome this issue using Go? I've thought of the following, however I'm not sure it is the best approach:
Creating a function to split the large JSON file into multiple smaller ones (up to 20MB), and then iterate over the files sending multiple smaller requests.
A function similar to the one currently being used to send the entire JSON payload:
func sendDataToService(data StructData) {
    payload, err := json.Marshal(data)
    if err != nil {
        log.Println("ERROR:", err)
    }
    request, err := http.NewRequest("POST", endpoint, bytes.NewBuffer(payload))
    if err != nil {
        log.Println("ERROR:", err)
    }
    client := &http.Client{}
    response, err := client.Do(request)
    log.Println("INFORMATIONAL:", request)
    if err != nil {
        log.Println("ERROR:", err)
    }
    defer response.Body.Close()
}
You can break the input into chunks and send each piece individually:
dec := json.NewDecoder(inputStream)
tok, err := dec.Token()
if err != nil {
    return err
}
if tok == json.Delim('[') {
    for {
        var obj json.RawMessage
        if err := dec.Decode(&obj); err != nil {
            return err
        }
        // Here, obj contains one element of the array. You can send this
        // to the server.
        if !dec.More() {
            break
        }
    }
}
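Building on that, one way to stay under the 25MB limit is to accumulate the decoded elements into batches of roughly 20MB and POST each batch. This is only a sketch (it needs the bytes, encoding/json, io, and net/http imports): the sendInBatches helper, the batch limit, and re-wrapping the elements in a JSON array are assumptions, not the service's actual protocol.
// sendInBatches decodes a large JSON array from r and POSTs it to endpoint
// in batches of at most ~20MB. The endpoint and batch limit are assumptions.
func sendInBatches(r io.Reader, endpoint string) error {
    dec := json.NewDecoder(r)
    if _, err := dec.Token(); err != nil { // consume the opening '[' of the top-level array
        return err
    }

    const maxBatch = 20 << 20 // ~20MB per request
    var batch []json.RawMessage
    var batchSize int

    flush := func() error {
        if len(batch) == 0 {
            return nil
        }
        payload, err := json.Marshal(batch) // re-wrap the elements as a JSON array
        if err != nil {
            return err
        }
        resp, err := http.Post(endpoint, "application/json", bytes.NewReader(payload))
        if err != nil {
            return err
        }
        resp.Body.Close()
        batch, batchSize = nil, 0
        return nil
    }

    for dec.More() {
        var obj json.RawMessage
        if err := dec.Decode(&obj); err != nil {
            return err
        }
        if batchSize+len(obj) > maxBatch {
            if err := flush(); err != nil {
                return err
            }
        }
        batch = append(batch, obj)
        batchSize += len(obj)
    }
    return flush() // send whatever is left
}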
As the server-side can process data progressively, I assume that the large JSON object can be split into smaller pieces. From this point, I can propose several options.
Use HTTP requests
Pros: Pretty simple to implement on the client-side.
Cons: Making hundreds of HTTP requests might be slow. You will also need to handle timeouts - this is additional complexity.
Use WebSocket messages
If the receiving side supports WebSockets, a step-by-step flow will look like this:
Split the input data into smaller pieces.
Connect to the WebSocket server.
Start sending messages with the smaller pieces till the end of the file.
Close connection to the server.
This solution might be more performant as you won't need to connect and disconnect from the server each time you send a message, as you'd do with HTTP.
However, both solutions suppose that you need to assemble all pieces on the server-side. For example, you would probably need to send along with the data a correlation ID to let the server know what file you are sending right now and a specific end-of-file message to let the server know when the file ends. In the case of the WebSocket server, you could assume that the entire file is sent during a single connection session if it is relevant.
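For illustration, here is a sketch of the WebSocket option using the github.com/gorilla/websocket package. The message envelope (correlation ID, sequence number, end-of-file flag), the URL, and the chunks variable are invented for this example; a real server would define its own protocol.
conn, _, err := websocket.DefaultDialer.Dial("ws://example.com/upload", nil)
if err != nil {
    log.Fatal(err)
}
defer conn.Close()

fileID := "my-large-file"      // correlation ID so the server can reassemble the pieces
for i, chunk := range chunks { // chunks: [][]byte produced by splitting the input
    msg, _ := json.Marshal(map[string]interface{}{
        "fileId": fileID,
        "seq":    i,
        "data":   chunk, // encoding/json base64-encodes []byte automatically
    })
    if err := conn.WriteMessage(websocket.TextMessage, msg); err != nil {
        log.Fatal(err)
    }
}
// Tell the server the file is complete.
eof, _ := json.Marshal(map[string]interface{}{"fileId": fileID, "eof": true})
if err := conn.WriteMessage(websocket.TextMessage, eof); err != nil {
    log.Fatal(err)
}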
There is a field in the database table that receives images in blob format. How can this be displayed on the site? The main goal is to send images to the database and display them on the website. It would be great if you gave an example of the Go code.
This is the insert code:
ins, err := db.Query(fmt.Sprintf("INSERT INTO `photo` (`photo`)"+" VALUES('%s')", img))
if err != nil {
    panic(err)
}
defer ins.Close()
Attempt to display an image (saving it in a variable):
vars := mux.Vars(r)
res, err := db.Query(fmt.Sprintf("SELECT * FROM `photo` WHERE `id` = '%s'", vars["id"]))
if err != nil {
    panic(err)
}
showPhoto = Photo{}
for res.Next() {
    var post Photo
    err = res.Scan(&post.Id, &post.Img)
    if err != nil {
        panic(err)
    }
    encodeImg, err := b64.StdEncoding.DecodeString(post.Img)
    showPhoto = post
}
Several files are sent from one input, so the terminal displays the error "1 variable, but base64.StdEncoding.DecodeString returns 2 values".
You should never save images directly in the database; the size of your database will increase a lot. Save the image files on the server, and store only the path to each file in the database.
Another solution, which is actually great and easy to implement, is using DigitalOcean Spaces. It is really cheap for what you get.
You can search for articles on the internet explaining why you should not store images directly in the database, like this one.
Sample solution: save the photo as a file in a directory, and save the path to that image in the database.
Then just show that image in HTML as: <img src="/static/{{.}}"/>
Don't forget the static file server. Images must be served from the same endpoint, something like this: app.StaticFiles("/static", "path/to/imagesDir")
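To make the file-on-disk approach concrete, here is a sketch using only the standard library (plus the question's db handle): it saves the uploaded image under an assumed ./uploads directory, stores just the path in an assumed photo(path) column, and serves that directory statically so the <img> tag works. The form field, directory, and schema names are illustrative, not the question's actual setup.
func uploadHandler(w http.ResponseWriter, r *http.Request) {
    file, header, err := r.FormFile("photo") // name of the file input in the HTML form
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    defer file.Close()

    path := filepath.Join("uploads", header.Filename)
    dst, err := os.Create(path)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    defer dst.Close()
    if _, err := io.Copy(dst, file); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    // Store only the path; a parameterized query avoids the fmt.Sprintf injection risk.
    if _, err := db.Exec("INSERT INTO `photo` (`path`) VALUES (?)", path); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
}

// In main(): serve the saved files so <img src="/static/..."> works.
http.Handle("/static/", http.StripPrefix("/static/", http.FileServer(http.Dir("uploads"))))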
I'm still learning the Go language, but I've been trying to find some practical things to work on to get a better handle on it. Currently, I'm trying to build a simple program that goes to a YouTube channel and returns some information by taking the public JSON and unmarshalling it.
Thus far I've tried making a completely custom struct that only has a few fields in it, but that doesn't seem to pull in any values. I've also tried using tools like https://mholt.github.io/json-to-go/ and getting the "real" struct that way. The issue with that method is there are numerous duplicates and I don't know enough to really assess how to tackle that.
This is an example JSON (I apologize for its size) https://pastebin.com/6u0b39tU
This is the struct that I get from the above tool: https://pastebin.com/3ZCu96st
The basic pattern of code I've tried is:
jsonFile, err := os.Open("test.json")
if err != nil {
    fmt.Println("Couldn't open file", err)
}
defer jsonFile.Close()
bytes, _ := ioutil.ReadAll(jsonFile)
var channel Autogenerated
json.Unmarshal(bytes, &Autogenerated)
if err != nil {
    fmt.Println("Failed to Unmarshal", err)
}
fmt.Println(channel.Fieldname)
Any feedback on the correct approach for how to handle something like this would be great. I get the feeling I'm just completely missing something.
In your code, you are not unmarshaling into the channel variable. Furthermore, you can optimize your code to not use ReadAll. Also, don't forget to check for errors (all errors).
Here is an improvement to your code.
jsonFile, err := os.Open("test.json")
if err != nil {
    log.Fatalf("could not open file: %v", err)
}
defer jsonFile.Close()

var channel Autogenerated
if err := json.NewDecoder(jsonFile).Decode(&channel); err != nil {
    log.Fatalf("failed to parse json: %v", err)
}

fmt.Println(channel.Fieldname)
Notice how a reference to channel is passed to Decode.
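As a side note on the autogenerated struct: if you only need a few values, a small hand-written struct is enough, as long as the json tags and nesting match the actual document. A sketch with made-up field names (the real keys are in the pasted JSON), reusing the jsonFile opened above:
// The field names and tags below are invented for illustration; adjust them to
// match the keys and nesting of the actual test.json.
type Channel struct {
    Title string `json:"title"`
    Stats struct {
        Subscribers int `json:"subscriberCount"`
    } `json:"statistics"`
}

var ch Channel
if err := json.NewDecoder(jsonFile).Decode(&ch); err != nil {
    log.Fatalf("failed to parse json: %v", err)
}
fmt.Println(ch.Title, ch.Stats.Subscribers)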
What is the best way to read large CSV files? At the moment I am reading one record at a time rather than using ReadAll().
reader := csv.NewReader(csvFile)
reader.FieldsPerRecord = -1
for {
    // read just one record at a time
    record, err := reader.Read()
    if err == io.EOF {
        break
    } else if err != nil {
        checkErr(err)
        return
    }
    // ... process record
}
Is there a better way to do this to save memory?
I am writing each record/row to a database by sending an array over gRPC to a separate service.
Yes, there is one option you can use to improve it.
It is possible to let the reader reuse the slice it returns on each Read call.
To do that, set reader.ReuseRecord = true.
But be careful, because the returned slice may be overwritten by the next call to Read!
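A sketch of the question's loop with ReuseRecord enabled; the explicit copy is what keeps the data valid if you batch rows before sending them to the gRPC service (csvFile and checkErr come from the question):
reader := csv.NewReader(csvFile)
reader.FieldsPerRecord = -1
reader.ReuseRecord = true // reuse the record slice to cut per-row allocations

for {
    record, err := reader.Read()
    if err == io.EOF {
        break
    } else if err != nil {
        checkErr(err)
        return
    }
    row := make([]string, len(record))
    copy(row, record) // safe to keep or send; record will be overwritten by the next Read
    // ... append row to the batch that goes to the gRPC service
}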