Process csv file from upload - csv

I have a gin application that receives a post request containing a csv file which I want to read without saving it. I'm stuck here trying to read from the post request with the following error message: cannot use file (variable of type *multipart.FileHeader) as io.Reader value in argument to csv.NewReader: missing method Read
file, err := c.FormFile("file")
if err != nil {
errList["Invalid_body"] = "Unable to get request"
c.JSON(http.StatusUnprocessableEntity, gin.H{
"status": http.StatusUnprocessableEntity,
"error": errList,
})
}
r := csv.NewReader(file) // <= Error message
records, err := r.ReadAll()
for _, record := range records {
fmt.Println(record)
}
Is there a good example that I could use?

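First, get the uploaded file from the request. Here r is an *http.Request, whose FormFile returns the file, its header, and an error (in a gin handler the request is available as c.Request). The header is not needed here, so it is discarded:
csvPartFile, _, openErr := r.FormFile("file")
if openErr != nil {
// handle error and return
}
defer csvPartFile.Close()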
Then read all the records from the file:
csvLines, readErr := csv.NewReader(csvPartFile).ReadAll()
if readErr != nil {
//handle error
}
Finally, loop over the records:
for _, line := range csvLines {
fmt.Println(line)
}
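The steps above can be tried end to end without gin. This is only a minimal sketch using net/http and a fake upload built with net/http/httptest; parseCSVUpload and newUploadRequest are hypothetical helper names, not part of any library. In a gin handler you would presumably pass c.Request.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"log"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
)

// parseCSVUpload (hypothetical helper) reads the multipart form file named
// field from r and parses it as CSV, without saving it to a path of our own.
func parseCSVUpload(r *http.Request, field string) ([][]string, error) {
	file, _, err := r.FormFile(field) // file, header, error
	if err != nil {
		return nil, fmt.Errorf("get form file %q: %w", field, err)
	}
	defer file.Close()
	return csv.NewReader(file).ReadAll()
}

// newUploadRequest (hypothetical helper) builds an in-memory multipart POST
// request so the parser can be exercised without a running server.
func newUploadRequest(field, filename, content string) (*http.Request, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	part, err := mw.CreateFormFile(field, filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write([]byte(content)); err != nil {
		return nil, err
	}
	// Close writes the trailing multipart boundary.
	if err := mw.Close(); err != nil {
		return nil, err
	}
	req := httptest.NewRequest(http.MethodPost, "/upload", &body)
	req.Header.Set("Content-Type", mw.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newUploadRequest("file", "data.csv", "ip,country\n1.1.1.1,US\n")
	if err != nil {
		log.Fatal(err)
	}
	records, err := parseCSVUpload(req, "file")
	if err != nil {
		log.Fatal(err)
	}
	for _, record := range records {
		fmt.Println(record)
	}
}
```

Note that FormFile parses the multipart form for you; large uploads may still be spooled to temporary files by the standard library.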

As other answers have mentioned, you should Open() it first.
The latest version of gin.Context.FormFile(string) seems to return only two values, a *multipart.FileHeader and an error.
This worked for me:
func(c *gin.Context) {
file_ptr, err := c.FormFile("file")
if err != nil {
log.Println(err.Error())
c.Status(http.StatusUnprocessableEntity)
return
}
log.Println(file_ptr.Filename)
file, err := file_ptr.Open()
if err != nil {
log.Println(err.Error())
c.Status(http.StatusUnprocessableEntity)
return
}
defer file.Close()
records, err := csv.NewReader(file).ReadAll()
if err != nil {
log.Println(err.Error())
c.Status(http.StatusUnprocessableEntity)
return
}
for _, line := range records {
fmt.Println(line)
}
}

Related

Exporting JSON into single file from loop function

I wrote some code that hits a public API and saves the JSON output to a file. But the data is stored as one JSON object per line instead of as a single JSON array.
For example:
Current Output:
{"ip":"1.1.1.1", "Country":"US"}
{"ip":"8.8.8.8", "Country":"IN"}
Desired Output:
[
{"ip":"1.1.1.1", "Country":"US"},
{"ip":"8.8.8.8", "Country":"IN"}
]
I know this should be pretty simple and I am missing something.
My Current Code is:
To read IP from file and hit the API one by one on each IP.
func readIPfromFile(filename string, outFile string, timeout int) {
data := jsonIn{}
//open input file
jsonFile, err := os.Open(filename) //open input file
...
...
jsonData := bufio.NewScanner(jsonFile)
for jsonData.Scan() {
// marshal json data & check for logs
if err := json.Unmarshal(jsonData.Bytes(), &data); err != nil {
log.Fatal(err)
}
//save to file
url := fmt.Sprintf("http://ipinfo.io/%s", data.Host)
GetGeoIP(url, outFile, timeout)
}
}
To make HTTP Request with custom request header and call write to file function.
func GetGeoIP(url string, outFile string, timeout int) {
geoClient := http.Client{
Timeout: time.Second * time.Duration(timeout), // timeout is given in seconds
}
req, err := http.NewRequest(http.MethodGet, url, nil)
if err != nil {
log.Fatal(err)
}
req.Header.Set("accept", "application/json")
res, getErr := geoClient.Do(req)
if getErr != nil {
log.Fatal(getErr)
}
if res.Body != nil {
defer res.Body.Close()
}
body, readErr := ioutil.ReadAll(res.Body)
if readErr != nil {
log.Fatal(readErr)
}
jsonout := jsonOut{}
jsonErr := json.Unmarshal(body, &jsonout)
if jsonErr != nil {
log.Fatal(jsonErr)
}
file, marshalErr := json.Marshal(jsonout)
if marshalErr != nil {
log.Fatal(marshalErr)
}
write2file(outFile, file)
}
To Write data to file:
func write2file(outFile string, file []byte) {
f, err := os.OpenFile(outFile, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0600)
if err != nil {
log.Fatal(err)
}
defer f.Close()
if _, err = f.WriteString(string(file)); err != nil {
log.Fatal(err)
}
if _, err = f.WriteString("\n"); err != nil {
log.Fatal(err)
}
}
I know I can change f.WriteString("\n") to f.WriteString(",") to add a comma, but adding the surrounding [ ] to the file is still challenging for me.
First, please do not invent a new way of marshaling JSON; just use Go's built-in encoding/json or another library from GitHub.
Second, if you want to create a JSON string that represents an array of objects, you need to build the slice of objects in Go and marshal it into a string (or, more precisely, into a slice of bytes).
I put together a simple example at the link below, but please try it yourself first if possible.
https://go.dev/play/p/RR_ok-fUTb_4
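In case the link goes stale, here is a minimal sketch of the same idea: collect every result in a slice and marshal the slice once at the end. The geoInfo type and the marshalAll helper are hypothetical stand-ins for the question's jsonOut type and write step, and the two records are the sample values from the question, not real API responses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// geoInfo is a hypothetical stand-in for the question's jsonOut type.
type geoInfo struct {
	IP      string `json:"ip"`
	Country string `json:"Country"`
}

// marshalAll encodes all collected results as a single JSON array.
func marshalAll(results []geoInfo) ([]byte, error) {
	return json.Marshal(results)
}

func main() {
	// In the real program each element would come from one API response,
	// appended inside the loop instead of being written to the file there.
	var results []geoInfo
	results = append(results, geoInfo{IP: "1.1.1.1", Country: "US"})
	results = append(results, geoInfo{IP: "8.8.8.8", Country: "IN"})

	out, err := marshalAll(results)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
	// prints [{"ip":"1.1.1.1","Country":"US"},{"ip":"8.8.8.8","Country":"IN"}]
}
```

The resulting bytes can then be written to the output file in one call, so there is no need to append per request or to patch in commas and brackets by hand.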

Changing the last character of a file

I want to continuously write JSON objects to a file. To be able to read it, I need to wrap them into an array. I don't want to read the whole file just to append to it. So what I'm doing now is:
comma := []byte(", ")
file, err := os.OpenFile(erp.TransactionsPath, os.O_WRONLY|os.O_APPEND|os.O_CREATE, 0666)
if err != nil {
return err
}
transaction, err := json.Marshal(t)
if err != nil {
return err
}
transaction = append(transaction, comma...)
file.Write(transaction)
But with this implementation I will need to add the [] brackets by hand (or via some script) before reading. How can I insert each new object before the closing bracket on every write?
You don't need to wrap the JSON objects into an array, you can just write them as-is. You may use json.Encoder to write them to the file, and you may use json.Decoder to read them. Encoder.Encode() and Decoder.Decode() encode and decode individual JSON values from a stream.
To prove it works, see this simple example:
const src = `{"id":"1"}{"id":"2"}{"id":"3"}`
dec := json.NewDecoder(strings.NewReader(src))
for {
var m map[string]interface{}
if err := dec.Decode(&m); err != nil {
if err == io.EOF {
break
}
panic(err)
}
fmt.Println("Read:", m)
}
It outputs:
Read: map[id:1]
Read: map[id:2]
Read: map[id:3]
When writing to / reading from a file, pass the *os.File to json.NewEncoder() and json.NewDecoder().
Here's a complete demo which creates a temporary file, uses json.Encoder to write JSON objects into it, then reads them back with json.Decoder:
objs := []map[string]interface{}{
map[string]interface{}{"id": "1"},
map[string]interface{}{"id": "2"},
map[string]interface{}{"id": "3"},
}
file, err := ioutil.TempFile("", "test.json")
if err != nil {
panic(err)
}
// Writing to file:
enc := json.NewEncoder(file)
for _, obj := range objs {
if err := enc.Encode(obj); err != nil {
panic(err)
}
}
// Debug: print file's content
fmt.Println("File content:")
if data, err := ioutil.ReadFile(file.Name()); err != nil {
panic(err)
} else {
fmt.Println(string(data))
}
// Reading from file:
if _, err := file.Seek(0, io.SeekStart); err != nil {
panic(err)
}
dec := json.NewDecoder(file)
for {
var obj map[string]interface{}
if err := dec.Decode(&obj); err != nil {
if err == io.EOF {
break
}
panic(err)
}
fmt.Println("Read:", obj)
}
It outputs:
File content:
{"id":"1"}
{"id":"2"}
{"id":"3"}
Read: map[id:1]
Read: map[id:2]
Read: map[id:3]
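To connect this back to the question's append-on-every-write use case, here is a minimal sketch that reopens the file in append mode for each record and later reads everything back. The transaction type and the appendJSON, readAllJSON and roundTrip helpers are hypothetical names made up for this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
)

// transaction is a hypothetical record type used only for this sketch.
type transaction struct {
	ID string `json:"id"`
}

// appendJSON opens path in append mode and writes v as one JSON value.
// Encoder.Encode adds a trailing newline after the value.
func appendJSON(path string, v interface{}) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_APPEND|os.O_CREATE, 0o666)
	if err != nil {
		return err
	}
	if err := json.NewEncoder(f).Encode(v); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// readAllJSON decodes every transaction stored in path, one value at a time.
func readAllJSON(path string) ([]transaction, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var out []transaction
	dec := json.NewDecoder(f)
	for {
		var t transaction
		if err := dec.Decode(&t); err != nil {
			if errors.Is(err, io.EOF) {
				return out, nil
			}
			return nil, err
		}
		out = append(out, t)
	}
}

// roundTrip appends one record per id to a fresh temporary file and reads
// them all back; the temporary directory is removed afterwards.
func roundTrip(ids []string) ([]transaction, error) {
	dir, err := os.MkdirTemp("", "txdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "transactions.json")
	for _, id := range ids {
		if err := appendJSON(path, transaction{ID: id}); err != nil {
			return nil, err
		}
	}
	return readAllJSON(path)
}

func main() {
	got, err := roundTrip([]string{"1", "2", "3"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(got)
}
```

Because each write is a complete JSON value, the file never has to be rewritten or patched to stay readable.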

detect duplicate in JSON String Golang

I have a JSON string like this:
"{\"a\": \"b\", \"a\":true,\"c\":[\"field_3 string 1\",\"field3 string2\"]}"
How can I detect the duplicate attribute in this JSON string using Go?
Use the json.Decoder to walk through the JSON. When an object is found, walk through keys and values checking for duplicate keys.
func check(d *json.Decoder, path []string, dup func(path []string) error) error {
// Get next token from JSON
t, err := d.Token()
if err != nil {
return err
}
// Is it a delimiter?
delim, ok := t.(json.Delim)
// No, nothing more to check.
if !ok {
// scalar type, nothing to do
return nil
}
switch delim {
case '{':
keys := make(map[string]bool)
for d.More() {
// Get field key.
t, err := d.Token()
if err != nil {
return err
}
key := t.(string)
// Check for duplicates.
if keys[key] {
// Duplicate found. Call the application's dup function. The
// function can record the duplicate or return an error to stop
// the walk through the document.
if err := dup(append(path, key)); err != nil {
return err
}
}
keys[key] = true
// Check value.
if err := check(d, append(path, key), dup); err != nil {
return err
}
}
// consume trailing }
if _, err := d.Token(); err != nil {
return err
}
case '[':
i := 0
for d.More() {
if err := check(d, append(path, strconv.Itoa(i)), dup); err != nil {
return err
}
i++
}
// consume trailing ]
if _, err := d.Token(); err != nil {
return err
}
}
return nil
}
Here's how to call it:
func printDup(path []string) error {
fmt.Printf("Duplicate %s\n", strings.Join(path, "/"))
return nil
}
...
data := `{"a": "b", "a":true,"c":["field_3 string 1","field3 string2"], "d": {"e": 1, "e": 2}}`
if err := check(json.NewDecoder(strings.NewReader(data)), nil, printDup); err != nil {
log.Fatal(err)
}
The output is:
Duplicate a
Duplicate d/e
Here's how to generate an error on the first duplicate key:
var ErrDuplicate = errors.New("duplicate")
func dupErr(path []string) error {
return ErrDuplicate
}
...
data := `{"a": "b", "a":true,"c":["field_3 string 1","field3 string2"], "d": {"e": 1, "e": 2}}`
err := check(json.NewDecoder(strings.NewReader(data)), nil, dupErr)
if err == ErrDuplicate {
fmt.Println("found a duplicate")
} else if err != nil {
// some other error
log.Fatal(err)
}
Another approach that would probably work well is to simply decode, re-encode, and then compare the length of the new JSON against the old JSON:
https://play.golang.org/p/50P-x1fxCzp
package main
import (
"encoding/json"
"fmt"
)
func main() {
jsn := []byte("{\"a\": \"b\", \"a\":true,\"c\":[\"field_3 string 1\",\"field3 string2\"]}")
var m map[string]interface{}
err := json.Unmarshal(jsn, &m)
if err != nil {
panic(err)
}
l := len(jsn)
jsn, err = json.Marshal(m)
if err != nil {
panic(err)
}
if l != len(jsn) {
panic(fmt.Sprintf("%s: %d (%d)", "duplicate key", l, len(jsn)))
}
}
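The right way to do it would be to re-implement the decoding done by encoding/json and store a map of the keys found in each object, but the above should work, especially if you first strip any spaces from the JSON using jsn = bytes.Replace(jsn, []byte(" "), []byte(""), -1) to guard against false positives. Keep in mind that it is only a heuristic: json.Marshal also escapes characters such as <, > and &, so input containing them changes length even without duplicate keys.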

Best way to parse problematic JSON files in Golang

I have some valid JSON files and some that are not valid (they lack the surrounding brackets).
Currently I have a function for each case: one uses json.Unmarshal for the valid files and the other uses json.NewDecoder for the bracketless ones.
How can I merge them into one function that can handle both cases?
EDIT:
Here is the code of the two cases:
func getDrivers() []Drivers {
raw, err := ioutil.ReadFile("/home/ubuntu/drivers.json")
if err != nil {
fmt.Println(err.Error())
os.Exit(1)
}
var d []Drivers
json.Unmarshal(raw, &d)
return d
}
func getMetrics() []Metrics {
file, err := os.Open("/home/ubuntu/metrics.json")
if err != nil {
fmt.Println("bad err!")
}
r := bufio.NewReader(file)
dec := json.NewDecoder(r)
// while the array contains values
var metrics []Metrics
for dec.More() {
var m Metrics
err := dec.Decode(&m)
if err != nil {
log.Fatal(err)
}
metrics = append(metrics, m)
}
return metrics
}
Thank you
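One possible way to merge the two cases, offered only as a sketch: decode each top-level value into a json.RawMessage and check whether it is an array or a single object. This assumes the bracketless files are plain concatenated or newline-separated objects with no commas between them, which is what the getMetrics loop above would accept. The metric type and the decodeAll and decodeString helpers are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// metric is a hypothetical stand-in for the question's Metrics/Drivers types.
type metric struct {
	Name string `json:"name"`
}

// decodeAll accepts either one JSON array of objects or a stream of bare
// objects (assumed to have no commas between them) and returns all items.
func decodeAll(r io.Reader) ([]metric, error) {
	var out []metric
	dec := json.NewDecoder(r)
	for {
		// Grab the next top-level value without interpreting it yet.
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			if errors.Is(err, io.EOF) {
				return out, nil
			}
			return nil, err
		}
		// An array contributes all of its elements.
		if trimmed := bytes.TrimSpace(raw); len(trimmed) > 0 && trimmed[0] == '[' {
			var batch []metric
			if err := json.Unmarshal(raw, &batch); err != nil {
				return nil, err
			}
			out = append(out, batch...)
			continue
		}
		// Anything else is treated as a single object.
		var m metric
		if err := json.Unmarshal(raw, &m); err != nil {
			return nil, err
		}
		out = append(out, m)
	}
}

// decodeString is a small convenience wrapper for trying decodeAll on literals.
func decodeString(s string) ([]metric, error) {
	return decodeAll(strings.NewReader(s))
}

func main() {
	inputs := []string{
		`[{"name":"cpu"},{"name":"mem"}]`,      // valid array
		"{\"name\":\"cpu\"}\n{\"name\":\"mem\"}", // bracketless stream
	}
	for _, in := range inputs {
		got, err := decodeString(in)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(len(got), got)
	}
}
```

For real files you would pass the opened *os.File to decodeAll; supporting both Drivers and Metrics would need one such function per type or a generic version.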

Efficient read and write CSV in Go

The Go code below reads in a 10,000 record CSV (of timestamp times and float values), runs some operations on the data, and then writes the original values to another CSV along with an additional column for score. However it is terribly slow (i.e. hours, but most of that is calculateStuff()) and I'm curious if there are any inefficiencies in the CSV reading/writing I can take care of.
package main
import (
"encoding/csv"
"log"
"os"
"strconv"
)
func ReadCSV(filepath string) ([][]string, error) {
csvfile, err := os.Open(filepath)
if err != nil {
return nil, err
}
defer csvfile.Close()
reader := csv.NewReader(csvfile)
fields, err := reader.ReadAll()
if err != nil {
return nil, err
}
return fields, nil
}
func main() {
// load data csv
records, err := ReadCSV("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
// write results to a new csv
outfile, err := os.Create("./where/to/write/resultsfile.csv")
if err != nil {
log.Fatal("Unable to open output")
}
defer outfile.Close()
writer := csv.NewWriter(outfile)
for i, record := range records {
time := record[0]
value := record[1]
// skip header row
if i == 0 {
writer.Write([]string{time, value, "score"})
continue
}
// get float values
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatalf("Record: %v, Error: %v", value, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
valueString := strconv.FormatFloat(floatValue, 'f', 8, 64)
scoreString := strconv.FormatFloat(score, 'f', 8, 64)
//fmt.Printf("Result: %v\n", []string{time, valueString, scoreString})
writer.Write([]string{time, valueString, scoreString})
}
writer.Flush()
}
I'm looking for help making this CSV read/write template code as fast as possible. For the scope of this question we need not worry about the calculateStuff method.
You're loading the whole file into memory before processing it, which can be slow with a big file.
Instead, loop, call .Read, and process one record at a time.
func processCSV(rc io.Reader) (ch chan []string) {
ch = make(chan []string, 10)
go func() {
r := csv.NewReader(rc)
if _, err := r.Read(); err != nil { //read header
log.Fatal(err)
}
defer close(ch)
for {
rec, err := r.Read()
if err != nil {
if err == io.EOF {
break
}
log.Fatal(err)
}
ch <- rec
}
}()
return
}
Note: this is roughly based on DaveC's comment.
This is essentially Dave C's answer from the comment section:
package main
import (
"encoding/csv"
"io"
"log"
"os"
"strconv"
)
func main() {
// setup reader
csvIn, err := os.Open("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
r := csv.NewReader(csvIn)
// setup writer
csvOut, err := os.Create("./where/to/write/resultsfile.csv")
if err != nil {
log.Fatal("Unable to open output")
}
w := csv.NewWriter(csvOut)
defer csvOut.Close()
// handle header
rec, err := r.Read()
if err != nil {
log.Fatal(err)
}
rec = append(rec, "score")
if err = w.Write(rec); err != nil {
log.Fatal(err)
}
for {
rec, err = r.Read()
if err != nil {
if err == io.EOF {
break
}
log.Fatal(err)
}
// get float value
value := rec[1]
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatalf("Record, error: %v, %v", value, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
scoreString := strconv.FormatFloat(score, 'f', 8, 64)
rec = append(rec, scoreString)
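if err = w.Write(rec); err != nil {
log.Fatal(err)
}
}
// flush once after the loop and report any buffered write error
w.Flush()
if err = w.Error(); err != nil {
log.Fatal(err)
}
}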
Note that all the logic is jammed into main(); it would be better to split it into several functions, but that's beyond the scope of this question.
encoding/csv can indeed be slow on big files, as it performs a lot of allocations. Since your format is so simple, I recommend using strings.Split instead, which is much faster.
If even that is not fast enough, you can consider implementing the parsing yourself using strings.IndexByte, which is implemented in assembly: http://golang.org/src/strings/strings_decl.go?s=274:310#L1
Having said that, you should also reconsider using ReadAll if the file is larger than your memory.
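A minimal sketch of the strings.Split idea, processing one line at a time with bufio.Scanner. It assumes the input is plain "time,value" lines with no quoted fields or embedded commas; for anything else encoding/csv is the safer choice. The score function is a hypothetical stand-in for calculateStuff, and process and processString are made-up helper names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"strconv"
	"strings"
)

// score is a hypothetical stand-in for the question's calculateStuff.
func score(v float64) float64 { return v * 2 }

// process reads "time,value" lines from in and writes each line to out with
// an extra score column. The first line is treated as the header.
func process(in io.Reader, out io.Writer) error {
	sc := bufio.NewScanner(in)
	w := bufio.NewWriter(out)
	first := true
	for sc.Scan() {
		line := sc.Text()
		if first {
			first = false
			fmt.Fprintf(w, "%s,score\n", line)
			continue
		}
		// Assumes no quoted fields, so a plain split on commas is enough.
		fields := strings.Split(line, ",")
		if len(fields) < 2 {
			return fmt.Errorf("malformed line %q", line)
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return fmt.Errorf("line %q: %w", line, err)
		}
		fmt.Fprintf(w, "%s,%s\n", line, strconv.FormatFloat(score(v), 'f', 8, 64))
	}
	if err := sc.Err(); err != nil {
		return err
	}
	// Flush once at the end and surface any buffered write error.
	return w.Flush()
}

// processString runs process on an in-memory string, for quick experiments.
func processString(s string) (string, error) {
	var sb strings.Builder
	err := process(strings.NewReader(s), &sb)
	return sb.String(), err
}

func main() {
	out, err := processString("time,value\nt1,1.5\nt2,2.25\n")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(out)
}
```

With real files you would pass the opened input and output *os.File values to process. Whether this beats encoding/csv for your data is something to measure rather than assume.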