I'm trying to transform the result of Go's built-in database/sql queries into JSON. I'm using goroutines for that, but I've run into problems.
The base problem:
There is a really big database with around 200k users, and I have to serve them over TCP sockets in a microservice-based system. Getting the users from the database takes 20ms, but transforming that bunch of data to JSON takes 10 seconds with the current solution. That's why I want to use goroutines.
Solution with Goroutines:
func getJSON(rows *sql.Rows, cnf configure.Config) ([]byte, error) {
log := logan.Log{
Cnf: cnf,
}
cols, _ := rows.Columns()
defer rows.Close()
done := make(chan struct{})
go func() {
defer close(done)
for result := range resultChannel {
results = append(
results,
result,
)
}
}()
wg.Add(1)
go func() {
for rows.Next() {
wg.Add(1)
go handleSQLRow(cols, rows)
}
wg.Done()
}()
go func() {
wg.Wait()
defer close(resultChannel)
}()
<-done
s, err := json.Marshal(results)
results = []resultContainer{}
if err != nil {
log.Context(1).Error(err)
}
rows.Close()
return s, nil
}
func handleSQLRow(cols []string, rows *sql.Rows) {
defer wg.Done()
result := make(map[string]string, len(cols))
fmt.Println("asd -> " + strconv.Itoa(counter))
counter++
rawResult := make([][]byte, len(cols))
dest := make([]interface{}, len(cols))
for i := range rawResult {
dest[i] = &rawResult[i]
}
rows.Scan(dest...) // GET PANIC
for i, raw := range rawResult {
if raw == nil {
result[cols[i]] = ""
} else {
fmt.Println(string(raw))
result[cols[i]] = string(raw)
}
}
resultChannel <- result
}
This solution gives me a panic with the following message:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x45974c]
goroutine 408 [running]:
panic(0x7ca140, 0xc420010150)
/usr/lib/golang/src/runtime/panic.go:500 +0x1a1
database/sql.convertAssign(0x793960, 0xc420529210, 0x7a5240, 0x0, 0x0, 0x0)
/usr/lib/golang/src/database/sql/convert.go:88 +0x1ef1
database/sql.(*Rows).Scan(0xc4203e4060, 0xc42021fb00, 0x44, 0x44, 0x44, 0x44)
/usr/lib/golang/src/database/sql/sql.go:1850 +0xc2
github.com/PumpkinSeed/zerodb/operations.handleSQLRow(0xc420402000, 0x44, 0x44, 0xc4203e4060)
/home/loow/gopath/src/github.com/PumpkinSeed/zerodb/operations/operations.go:290 +0x19c
created by github.com/PumpkinSeed/zerodb/operations.getJSON.func2
/home/loow/gopath/src/github.com/PumpkinSeed/zerodb/operations/operations.go:258 +0x91
exit status 2
The current solution, which works but takes too much time:
func getJSON(rows *sql.Rows, cnf configure.Config) ([]byte, error) {
log := logan.Log{
Cnf: cnf,
}
var results []resultContainer
cols, _ := rows.Columns()
rawResult := make([][]byte, len(cols))
dest := make([]interface{}, len(cols))
for i := range rawResult {
dest[i] = &rawResult[i]
}
defer rows.Close()
for rows.Next() {
result := make(map[string]string, len(cols))
rows.Scan(dest...)
for i, raw := range rawResult {
if raw == nil {
result[cols[i]] = ""
} else {
result[cols[i]] = string(raw)
}
}
results = append(results, result)
}
s, err := json.Marshal(results)
if err != nil {
log.Context(1).Error(err)
}
rows.Close()
return s, nil
}
Question:
Why does the goroutine solution give me an error? It isn't an obvious panic, because the first ~200 goroutines run properly.
UPDATE
Performance test for the original working solution:
INFO[0020] setup taken -> 3.149124658s file=operations.go func=operations.getJSON line=260 service="Database manager" ts="2017-04-02 19:45:27.132881211 +0100 BST"
INFO[0025] toJSON taken -> 5.317647046s file=operations.go func=operations.getJSON line=263 service="Database manager" ts="2017-04-02 19:45:32.450551417 +0100 BST"
The SQL-to-map step takes 3 seconds and the map-to-JSON step takes 5 seconds.
Goroutines won't improve performance on CPU-bound operations like JSON marshaling. What you need is a more efficient JSON marshaler. There are some available, although I haven't used any; a quick search for 'faster JSON marshaling' will turn up many results. A popular one is ffjson. I suggest starting there.
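For what it's worth, here is a minimal sketch (not the poster's function, and untested against the real schema) that keeps Scan on a single goroutine — the panic above is what you get when rows.Scan is called from many goroutines while another keeps calling rows.Next, since *sql.Rows is not safe for concurrent use — and streams each row into one json.Encoder, so the full results slice never has to be built. Whether this beats a faster marshaler is something to measure.
// assumed imports: "database/sql", "encoding/json", "io"
func writeJSON(rows *sql.Rows, w io.Writer) error {
    cols, err := rows.Columns()
    if err != nil {
        return err
    }
    defer rows.Close()
    // One set of scan targets, reused for every row, all on this goroutine.
    rawResult := make([][]byte, len(cols))
    dest := make([]interface{}, len(cols))
    for i := range rawResult {
        dest[i] = &rawResult[i]
    }
    enc := json.NewEncoder(w)
    if _, err := io.WriteString(w, "["); err != nil {
        return err
    }
    first := true
    for rows.Next() {
        if err := rows.Scan(dest...); err != nil {
            return err
        }
        row := make(map[string]string, len(cols))
        for i, raw := range rawResult {
            row[cols[i]] = string(raw)
        }
        if !first {
            if _, err := io.WriteString(w, ","); err != nil {
                return err
            }
        }
        first = false
        // Encode appends a newline after each value; the stream is still valid JSON.
        if err := enc.Encode(row); err != nil {
            return err
        }
    }
    if _, err := io.WriteString(w, "]"); err != nil {
        return err
    }
    return rows.Err()
}
Here w could be the TCP connection itself or a bytes.Buffer, depending on how the response is sent.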
Related
I'm working on a program that makes a query to MySQL and then, for each row, changes something with that row and updates it.
The problem is that sometimes when performing an update I get a deadlock. I'm not sure whether it's because the select query hasn't released its lock by the time I update, or whether it's something else.
Example of what I'm doing:
const (
selectQuery = `select user_id, original_transaction_id, max(payment_id) as max_payment_id from Payment_Receipt
where auto_renew_status = 1 group by user_id, original_transaction_id having count(*) > 1`
updateQuery = `update Payment_Receipt set auto_renew_status = 0, changed_by = "payment_receipt_condenser",
changed_time = ? where user_id = ? and original_transaction_id = ? and payment_id != ? and auto_renew_status = 1`
)
mysql.go:
func New(db *sql.DB, driver string) (database.Database, error) {
sqlDB := sqlx.NewDb(db, driver)
if err := db.Ping(); err != nil {
return nil, errors.Wrap(err, "connecting to database")
}
selectStmt, err := sqlDB.Preparex(selectQuery)
if err != nil {
return nil, errors.Wrap(err, "preparing select query")
}
updateStmt, err := sqlDB.Preparex(updateQuery)
if err != nil {
return nil, errors.Wrap(err, "preparing update query")
}
return &mysql{
db: sqlDB,
selectStmt: selectStmt,
updateStmt: updateStmt,
}, nil
}
func (m *mysql) Query() (<- chan *database.Row, error) {
rowsChan := make(chan *database.Row)
rows, err := m.selectStmt.Queryx()
if err != nil {
return nil, errors.Wrap(err, "making query")
}
go func() {
defer rows.Close()
defer close(rowsChan)
for rows.Next() {
row := &database.Row{}
if err := rows.StructScan(row); err != nil {
log.WithError(err).WithField("user_id", row.UserID.Int32).Error("scanning row")
}
// change some of the data here
// and put into channel for worker to consume
rowsChan <- row
}
}()
return rowsChan, nil
}
func (m *mysql) Update(row *database.Row) error {
tx, err := m.db.Beginx()
if err != nil {
return errors.Wrap(err, "beginning transaction")
}
if _, err := tx.Stmtx(m.updateStmt).Exec(row.ChangedTime); err != nil {
return errors.Wrap(err, "executing update")
}
if err := tx.Commit(); err != nil {
return errors.Wrap(err, "committing transaction")
}
return nil
}
worker.go
func (w *worker) Run(wg *sync.WaitGroup) {
rowsChan, err := w.db.Query()
if err != nil {
log.WithError(err).Fatal("failed making query")
}
for i := 0; i < w.config.Count(); i++ {
wg.Add(1)
go func() {
defer wg.Done()
for row := range rowsChan {
if err := w.db.Update(row); err != nil {
log.WithError(err).WithField("user_id", row.UserID.Int32).Error("updating row")
}
}
}()
}
}
You could make the results (row) channel returned by Query() buffered:
func (m *mysql) Query() (<- chan *database.Row, error) {
rowsChan := make(chan *database.Row, 1000) // <- band-aid fix
// ...
}
This will ensure that the row-collector goroutine can write multiple results without waiting for your worker goroutines to read them. The query operation will complete (provided there are 1000 rows or fewer), and the update goroutines can begin their parallel work.
If this fixes things, then consider putting an SQL limit on your queries (e.g. LIMIT 1000) to ensure you don't hit the deadlock again (if 1000+ records is a real possibility).
Beyond that, consider crafting "pagination"-style queries that grab the next, say, 1000 rows, using row-ID markers to ensure full coverage of the results, all while avoiding locking out any of your update operations.
I was playing with Go recently and got stuck with a runtime error I can't explain. These are my working functions.
type User struct {
Browsers []string `json:"browsers"`
Name string `json:"name"`
Email string `json:"email"`
}
func asyncUserProcJson(wg *sync.WaitGroup, users *[]User, ch chan []byte) {
for buf := range ch {
var mu sync.Mutex
var user User
mu.Lock()
err := json.Unmarshal(buf, &user)
mu.Unlock()
if err != nil {
fmt.Println("json:", err)
wg.Done()
continue
}
*users = append(*users, user)
wg.Done()
}
}
func userProcJson(buf []byte) (User, error) {
var user User
err := json.Unmarshal(buf, &user)
if err != nil {
return User{}, err
}
return user, nil
}
With the common, non-concurrent approach it works as expected. But if I try to use a channel to pass the bytes to a goroutine... it fails.
type AsyncUserProc func(*sync.WaitGroup, *[]User, chan []byte)
type UserProc func(buf []byte) (User, error)
type SearchParams struct {
out io.Writer
asyncUserProc AsyncUserProc
userProc UserProc
}
func (sp SearchParams) AsyncSearch() []User {
file, err := os.Open(filePath)
if err != nil {
log.Fatalln(err)
}
var Users = make([]User, 0, 1024)
var ch = make(chan []byte)
var wg sync.WaitGroup
go sp.asyncUserProc(&wg, &Users, ch)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
wg.Add(1)
ch <- scanner.Bytes()
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading standard input:", err)
}
close(ch)
wg.Wait()
return Users
}
func (sp SearchParams) Search() []User {
file, err := os.Open(filePath)
if err != nil {
log.Fatalln(err)
}
// json processor
var Users = make([]User, 0, 1024)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
u, err := sp.userProc(scanner.Bytes())
if err != nil {
log.Panicln(err)
continue
}
Users = append(Users, u)
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading standard input:", err)
}
return Users
}
The workflow is the following:
filePath contains JSON chunks (one per line)
Open it for reading.
Create a line scanner
(AsyncSearch)
Pass the line to the channel.
Receive the line from the channel via range (a blocking operation)
Pass it to json.Unmarshal
trouble
(Search)
Pass line directly to userProc func
Enjoy result
I am getting a lot of (different) errors:
a lot of JSON unmarshaling errors
index out of range
JSON decoder out of sync - data changing underfoot?
The description of the last error reads:
// phasePanicMsg is used as a panic message when we end up with something that
// shouldn't happen. It can indicate a bug in the JSON decoder, or that
// something is editing the data slice while the decoder executes.
So here is the question: how is the byte slice being modified?
I thought the channel send was a blocking operation. What am I missing in the language mechanics?
Examples of the errors (different on each run):
json: invalid character 'i' looking for beginning of value
json: invalid character ':' after top-level value
json: invalid character 'r' looking for beginning of value
panic: runtime error: index out of range
----
json: invalid character '.' after top-level value
json: invalid character 'K' looking for beginning of value
panic: JSON decoder out of sync - data changing underfoot?
Package bufio
import "bufio"
func (*Scanner) Bytes
func (s *Scanner) Bytes() []byte
Bytes returns the most recent token generated by a call to Scan. The
underlying array may point to data that will be overwritten by a
subsequent call to Scan. It does no allocation.
The underlying array may point to data that will be overwritten by a subsequent call to Scan.
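A minimal fix sketch, changing only the scan loop in AsyncSearch: copy each token before sending it on the channel, so the goroutine never sees a buffer that Scan is about to reuse.
for scanner.Scan() {
    wg.Add(1)
    line := make([]byte, len(scanner.Bytes()))
    copy(line, scanner.Bytes()) // own copy; Scan may overwrite its buffer on the next call
    ch <- line
}
Alternatively, ch <- []byte(scanner.Text()) also sends an independent copy.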
I am trying to write a native messaging host for Chrome in Go. For this purpose, I tried using the chrome-go as well as the chrome-native-messaging packages. Both presented the same problem, explained below.
Here is the code. I have added the relevant parts of the chrome-go package to the main file instead of importing it, for easier understanding.
The following code works when I send it a JSON message like {"content": "Apple Mango"}. However, it stops working once the length of the JSON goes over approximately 65500 characters, give or take a hundred. There is no error output either.
package main
import (
"encoding/binary"
"encoding/json"
"fmt"
"io"
"os"
)
var byteOrder binary.ByteOrder = binary.LittleEndian
func Receive(reader io.Reader) ([]byte, error) {
// Read message length in native byte order
var length uint32
if err := binary.Read(reader, byteOrder, &length); err != nil {
return nil, err
}
// Return if no message
if length == 0 {
return nil, nil
}
// Read message body
received := make([]byte, length)
if n, err := reader.Read(received); err != nil || n != len(received) {
return nil, err
}
return received, nil
}
type response struct {
Content string `json:"content"`
}
func main() {
msg, err := Receive(os.Stdin)
if err != nil {
panic(err)
}
var res response
err = json.Unmarshal([]byte(msg), &res)
if err != nil {
panic(err)
}
fmt.Println(res.Content)
}
For those interested in testing, I have set up a repository with instructions. Run the following
git clone --depth=1 https://tesseract-index#bitbucket.org/tesseract-index/chrome-native-messaging-test-riz.git && cd chrome-native-messaging-test-riz
./json2msg.js < test-working.json | go run main.go
./json2msg.js < test-not-working.json | go run main.go
You will see that test-not-working.json gives no output, even though it differs from test-working.json by only a few hundred characters.
What is the issue here?
There is a limit on the pipe buffer, which varies across systems. Mac OS X, for example, uses a capacity of 16384 bytes by default.
You can use this bash script to check your buffer capacity:
M=0; while printf A; do >&2 printf "\r$((++M)) B"; done | sleep 999
So it is not related to Go itself: I changed your code to read from a file and unmarshal, and it worked:
func main() {
reader, err := os.Open("test-not-working.json")
if err != nil {
panic(err)
}
var res response
decoder := json.NewDecoder(reader)
err = decoder.Decode(&res)
if err != nil {
panic(err)
}
fmt.Println(res.Content)
}
This is because the pipe buffer of your OS is limited to 65536 bytes, so a single os.Stdin.Read(...) call can return at most 65536 bytes rather than the whole message.
You can fix your code by replacing the reader.Read(received) call with io.ReadFull, which keeps reading until the buffer is filled:
n, err := io.ReadFull(reader, received)
And there is your error:
msg, err := Receive(os.Stdin)
if err != nil {
panic(err)
}
You compared err with nil, but you did not compare msg with nil. Since only 65532 (65536 - 4) bytes were read, Receive(...) returned nil, nil.
To fix this, your Receive(...) function ought not to return nil, nil.
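Putting both fixes together, here is a sketch of Receive (the error returned for a zero-length message is an assumption; adjust it to whatever your protocol expects):
// assumed imports: "encoding/binary", "errors", "io"
func Receive(reader io.Reader) ([]byte, error) {
    // Read the message length in native byte order.
    var length uint32
    if err := binary.Read(reader, byteOrder, &length); err != nil {
        return nil, err
    }
    if length == 0 {
        return nil, errors.New("empty message")
    }
    // io.ReadFull keeps reading until len(received) bytes have arrived,
    // so a 64 KiB pipe buffer no longer truncates the message.
    received := make([]byte, length)
    if _, err := io.ReadFull(reader, received); err != nil {
        return nil, err
    }
    return received, nil
}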
I am trying to load a big CSV file in Go using goroutines. The dimensions of the CSV are (254882, 100). When I parse the CSV with my goroutines and store it into a 2D slice, I get fewer than 254882 rows, and the number varies on each run. I feel it is happening because of the goroutines but can't pinpoint the reason. Can anyone please help me? I am also new to Go. Here is my code:
func loadCSV(csvFile string) (*[][]float64, error) {
startTime := time.Now()
var dataset [][]float64
f, err := os.Open(csvFile)
if err != nil {
return &dataset, err
}
r := csv.NewReader(bufio.NewReader(f))
counter := 0
var wg sync.WaitGroup
for {
record, err := r.Read()
if err == io.EOF {
break
}
if counter != 0 {
wg.Add(1)
go func(r []string, dataset *[][]float64) {
var temp []float64
for _, each := range record {
f, err := strconv.ParseFloat(each, 64)
if err == nil {
temp = append(temp, f)
}
}
*dataset = append(*dataset, temp)
wg.Done()
}(record, &dataset)
}
counter++
}
wg.Wait()
duration := time.Now().Sub(startTime)
log.Printf("Loaded %d rows in %v seconds", counter, duration)
return &dataset, nil
}
And my main function looks like the following
func main() {
// runtime.GOMAXPROCS(4)
dataset, err := loadCSV("AvgW2V_train.csv")
if err != nil {
panic(err)
}
fmt.Println(len(*dataset))
}
If anyone needs to download the CSV too, then click the link below (485 MB)
https://drive.google.com/file/d/1G4Nw6JyeC-i0R1exWp5BtRtGM1Fwyelm/view?usp=sharing
Go Data Race Detector
Your results are undefined because you have data races.
~/gopath/src$ go run -race racer.go
==================
WARNING: DATA RACE
Write at 0x00c00008a060 by goroutine 6:
runtime.mapassign_faststr()
/home/peter/go/src/runtime/map_faststr.go:202 +0x0
main.main.func2()
/home/peter/gopath/src/racer.go:16 +0x6a
Previous write at 0x00c00008a060 by goroutine 5:
runtime.mapassign_faststr()
/home/peter/go/src/runtime/map_faststr.go:202 +0x0
main.main.func1()
/home/peter/gopath/src/racer.go:11 +0x6a
Goroutine 6 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:14 +0x88
Goroutine 5 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:9 +0x5b
==================
fatal error: concurrent map writes
==================
WARNING: DATA RACE
Write at 0x00c00009a088 by goroutine 6:
main.main.func2()
/home/peter/gopath/src/racer.go:16 +0x7f
Previous write at 0x00c00009a088 by goroutine 5:
main.main.func1()
/home/peter/gopath/src/racer.go:11 +0x7f
Goroutine 6 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:14 +0x88
Goroutine 5 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:9 +0x5b
==================
goroutine 34 [running]:
runtime.throw(0x49e156, 0x15)
/home/peter/go/src/runtime/panic.go:608 +0x72 fp=0xc000094718 sp=0xc0000946e8 pc=0x44b342
runtime.mapassign_faststr(0x48ace0, 0xc00008a060, 0x49c9c3, 0x8, 0xc00009a088)
/home/peter/go/src/runtime/map_faststr.go:211 +0x46c fp=0xc000094790 sp=0xc000094718 pc=0x43598c
main.main.func1(0x49c9c3, 0x8)
/home/peter/gopath/src/racer.go:11 +0x6b fp=0xc0000947d0 sp=0xc000094790 pc=0x47ac6b
runtime.goexit()
/home/peter/go/src/runtime/asm_amd64.s:1340 +0x1 fp=0xc0000947d8 sp=0xc0000947d0 pc=0x473061
created by main.main
/home/peter/gopath/src/racer.go:9 +0x5c
goroutine 1 [sleep]:
time.Sleep(0x5f5e100)
/home/peter/go/src/runtime/time.go:105 +0x14a
main.main()
/home/peter/gopath/src/racer.go:19 +0x96
goroutine 35 [runnable]:
main.main.func2(0x49c9c3, 0x8)
/home/peter/gopath/src/racer.go:16 +0x6b
created by main.main
/home/peter/gopath/src/racer.go:14 +0x89
exit status 2
~/gopath/src$
racer.go:
package main
import (
"bufio"
"encoding/csv"
"fmt"
"io"
"log"
"os"
"strconv"
"sync"
"time"
)
func loadCSV(csvFile string) (*[][]float64, error) {
startTime := time.Now()
var dataset [][]float64
f, err := os.Open(csvFile)
if err != nil {
return &dataset, err
}
r := csv.NewReader(bufio.NewReader(f))
counter := 0
var wg sync.WaitGroup
for {
record, err := r.Read()
if err == io.EOF {
break
}
if counter != 0 {
wg.Add(1)
go func(r []string, dataset *[][]float64) {
var temp []float64
for _, each := range record {
f, err := strconv.ParseFloat(each, 64)
if err == nil {
temp = append(temp, f)
}
}
*dataset = append(*dataset, temp)
wg.Done()
}(record, &dataset)
}
counter++
}
wg.Wait()
duration := time.Now().Sub(startTime)
log.Printf("Loaded %d rows in %v seconds", counter, duration)
return &dataset, nil
}
func main() {
// runtime.GOMAXPROCS(4)
dataset, err := loadCSV("/home/peter/AvgW2V_train.csv")
if err != nil {
panic(err)
}
fmt.Println(len(*dataset))
}
There is no need to use *[][]float64; that is effectively a double pointer, since a slice already refers to its backing array, so you can return the slice itself.
I have made some minor modifications to your program.
dataset is available to the new goroutine, since it's declared in an enclosing block.
Similarly, record is also visible there, but since the record variable changes on every iteration, we need to pass it to the new goroutine as an argument.
There is no need to pass dataset, as the variable itself does not change, and that is what we want, so every goroutine can append temp to the same dataset.
But a race condition happens when multiple goroutines try to append to the same variable, i.e., multiple goroutines write to the same variable.
So we need to make sure that only one goroutine can append at any instant.
So we use a lock to make the appends sequential.
package main
import (
"bufio"
"encoding/csv"
"fmt"
"os"
"strconv"
"sync"
)
func loadCSV(csvFile string) [][]float64 {
var dataset [][]float64
f, _ := os.Open(csvFile)
r := csv.NewReader(f)
var wg sync.WaitGroup
l := new(sync.Mutex) // lock
for record, err := r.Read(); err == nil; record, err = r.Read() {
wg.Add(1)
go func(record []string) {
defer wg.Done()
var temp []float64
for _, each := range record {
if f, err := strconv.ParseFloat(each, 64); err == nil {
temp = append(temp, f)
}
}
l.Lock() // lock before writing
dataset = append(dataset, temp) // write
l.Unlock() // unlock
}(record)
}
wg.Wait()
return dataset
}
func main() {
dataset := loadCSV("train.csv")
fmt.Println(len(dataset))
}
Some errors were not handled, to keep the example minimal, but you should handle them.
The Go code below reads in a 10,000-record CSV (of timestamps and float values), runs some operations on the data, and then writes the original values to another CSV along with an additional column for the score. However, it is terribly slow (i.e. hours, though most of that is calculateStuff()), and I'm curious whether there are any inefficiencies in the CSV reading/writing that I can take care of.
package main
import (
"encoding/csv"
"log"
"os"
"strconv"
)
func ReadCSV(filepath string) ([][]string, error) {
csvfile, err := os.Open(filepath)
if err != nil {
return nil, err
}
defer csvfile.Close()
reader := csv.NewReader(csvfile)
fields, err := reader.ReadAll()
return fields, err
}
func main() {
// load data csv
records, err := ReadCSV("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
// write results to a new csv
outfile, err := os.Create("./where/to/write/resultsfile.csv")
if err != nil {
log.Fatal("Unable to open output")
}
defer outfile.Close()
writer := csv.NewWriter(outfile)
for i, record := range records {
time := record[0]
value := record[1]
// skip header row
if i == 0 {
writer.Write([]string{time, value, "score"})
continue
}
// get float values
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatal("Record: %v, Error: %v", floatValue, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
valueString := strconv.FormatFloat(floatValue, 'f', 8, 64)
scoreString := strconv.FormatFloat(score, 'f', 8, 64)
//fmt.Printf("Result: %v\n", []string{time, valueString, scoreString})
writer.Write([]string{time, valueString, scoreString})
}
writer.Flush()
}
I'm looking for help making this CSV read/write template code as fast as possible. For the scope of this question we need not worry about the calculateStuff method.
You're loading the whole file into memory first and then processing it; that can be slow with a big file.
You need to loop, calling .Read and processing one line at a time.
func processCSV(rc io.Reader) (ch chan []string) {
ch = make(chan []string, 10)
go func() {
r := csv.NewReader(rc)
if _, err := r.Read(); err != nil { //read header
log.Fatal(err)
}
defer close(ch)
for {
rec, err := r.Read()
if err != nil {
if err == io.EOF {
break
}
log.Fatal(err)
}
ch <- rec
}
}()
return
}
playground
Note: it's roughly based on Dave C's comment.
This is essentially Dave C's answer from the comments section:
package main
import (
"encoding/csv"
"io"
"log"
"os"
"strconv"
)
func main() {
// setup reader
csvIn, err := os.Open("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
r := csv.NewReader(csvIn)
// setup writer
csvOut, err := os.Create("./where/to/write/resultsfile.csv")
if err != nil {
log.Fatal("Unable to open output")
}
w := csv.NewWriter(csvOut)
defer csvOut.Close()
// handle header
rec, err := r.Read()
if err != nil {
log.Fatal(err)
}
rec = append(rec, "score")
if err = w.Write(rec); err != nil {
log.Fatal(err)
}
for {
rec, err = r.Read()
if err != nil {
if err == io.EOF {
break
}
log.Fatal(err)
}
// get float value
value := rec[1]
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatal("Record, error: %v, %v", value, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
scoreString := strconv.FormatFloat(score, 'f', 8, 64)
rec = append(rec, scoreString)
if err = w.Write(rec); err != nil {
log.Fatal(err)
}
w.Flush()
}
}
Note that, of course, the logic is all jammed into main(); it would be better to split it into several functions, but that's beyond the scope of this question.
encoding/csv is indeed very slow on big files, as it performs a lot of allocations. Since your format is so simple, I recommend using strings.Split instead, which is much faster.
If even that is not fast enough you can consider implementing the parsing yourself using strings.IndexByte which is implemented in assembly: http://golang.org/src/strings/strings_decl.go?s=274:310#L1
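For illustration, here is a rough sketch of the strings.Split approach (a hypothetical helper; it assumes plain comma-separated fields with no quoted commas, which encoding/csv would otherwise handle for you):
// assumed imports: "bufio", "io", "strconv", "strings"
func processLines(r io.Reader, handle func(time string, value float64)) error {
    sc := bufio.NewScanner(r)
    if !sc.Scan() { // skip the header row
        return sc.Err()
    }
    for sc.Scan() {
        fields := strings.Split(sc.Text(), ",")
        if len(fields) < 2 {
            continue
        }
        v, err := strconv.ParseFloat(fields[1], 64)
        if err != nil {
            return err
        }
        handle(fields[0], v)
    }
    return sc.Err()
}
You would call calculateStuff and the CSV writer from inside handle, keeping the streaming structure of the answers above.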
Having said that, you should also reconsider using ReadAll if the file is larger than your memory.