Why does rows.Close() take so long when I call it after exiting the rows.Next() loop early, before all rows have been processed?
It happens when the query returns a huge amount of data (around 300,000 rows).
The problem doesn't occur when the number of rows is smaller.
func SelectHugeAmountOfRows() {
query := `SELECT * FROM big_table`
rows, _ := Conn.Query(context.Background(), query)
defer func() {
fmt.Println("Start close rows")
start := time.Now()
rows.Close()
duration := time.Since(start)
fmt.Println("rowsClose duration:", duration)
}()
for rows.Next() {
rowValues, err := rows.Values()
if err != nil {
fmt.Println(err)
return
}
// do something with rowValues and get error in process
err = func() error {
fmt.Println(rowValues)
return errors.New("some error")
}()
if err != nil {
return
}
}
if err := rows.Err(); err != nil {
fmt.Println(err)
return
}
}
rowsClose duration: 1m2.5488669s
Interestingly, rows.Close() in this case takes about as long as processing every row in the rows.Next() loop without breaking out of it.
Usually, result, err := func() is used.
When one of the variables is already initialized:
_, err := func()
var result string
result, err = func()
Doing:
result, err = func()
all_results += result // seems redundant and unneeded
How do you append results to one of them (result), and reset the other one?
// along the lines of this:
var result slice
// for loop {
result, _ += func() // combine this line
_, err = func() // with this line
Can you do:
result +=, err = func()
// or
result, err +=, = func()
// or
result, err += = func()
// or
result, err (+=, =) func() // ?
The language spec does not support different treatment for multiple return values.
However, it's very easy to do it with a helper function:
func foo() (int, error) {
return 1, nil
}
func main() {
var all int
add := func(result int, err error) error {
all += result
return err
}
if err := add(foo()); err != nil {
panic(err)
}
if err := add(foo()); err != nil {
panic(err)
}
if err := add(foo()); err != nil {
panic(err)
}
fmt.Println(all)
}
This will output 3 (try it on the Go Playground).
If you can move the error handling into the helper function, it can also look like this:
var all int
check := func(result int, err error) int {
if err != nil {
panic(err)
}
return result
}
all += check(foo())
all += check(foo())
all += check(foo())
fmt.Println(all)
This outputs the same, try this one on the Go Playground.
Another variant can be to do everything in the helper function:
var all int
handle := func(result int, err error) {
if err != nil {
panic(err)
}
all += result
}
handle(foo())
handle(foo())
handle(foo())
fmt.Println(all)
Try this one on the Go Playground.
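With Go 1.18 or later, the same helper can be written once for any result type by using a type parameter. This is only a sketch under that assumption; the names must and sum are illustrative and not part of any library:

```go
package main

import "fmt"

// foo stands in for any function that returns a value and an error.
func foo() (int, error) {
	return 1, nil
}

// must is a hypothetical generic helper: it panics on a non-nil error
// and otherwise passes the result through unchanged.
func must[T any](result T, err error) T {
	if err != nil {
		panic(err)
	}
	return result
}

// sum calls foo n times and accumulates the results.
func sum(n int) int {
	all := 0
	for i := 0; i < n; i++ {
		all += must(foo())
	}
	return all
}

func main() {
	fmt.Println(sum(3))
}
```

As in the variants above, the multi-value call foo() is passed directly as the argument list of the helper, which is what lets the error be handled in one place.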
See related: Multiple values in single-value context
I'm finally working on my side project around the super nerdy tabletop game Warhammer. I've created a MySQL database, and my next step is to build an API.
I've got three tables at the moment: "tyranids", "greyknghts" and "deathguard". I want a dynamic query that selects from a targeted table. I can do this with one function per table, but as the number of tables grows I need to make it dynamic.
func getTyranids(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-type", "application/json")
var units []Unit
result, err := db.Query("SELECT * FROM tyranids")
if err != nil {
panic(err.Error())
}
defer result.Close()
for result.Next() {
var unit Unit
err := result.Scan(&unit.ID, &unit.Name, &unit.Type, &unit.Movement, &unit.WeaponsSkill, &unit.BallisticSkill, &unit.Strength, &unit.Toughness, &unit.Wounds, &unit.Attacks, &unit.Leadership, &unit.Initiate, &unit.Points)
if err != nil {
panic(err.Error())
}
units = append(units, unit)
}
json.NewEncoder(w).Encode(units)
}
How can I write this so I won't need a function for each table?
I've made it work with mux.Vars for each individual unit.
func getTyranidUnit(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
params := mux.Vars(r)
result, err := db.Query("SELECT * FROM tyranids WHERE name = ?", params["name"])
if err != nil {
panic(err.Error())
}
defer result.Close()
var unit Unit
for result.Next() {
err := result.Scan(&unit.ID, &unit.Name, &unit.Type, &unit.Movement, &unit.WeaponsSkill, &unit.BallisticSkill, &unit.Strength, &unit.Toughness, &unit.Wounds, &unit.Attacks, &unit.Leadership, &unit.Initiate, &unit.Points)
if err != nil {
panic(err.Error())
}
}
json.NewEncoder(w).Encode(unit)
}
func main() {
db, err = sql.Open("mysql", "xx:xx@tcp(xxx)/Warhammer")
if err != nil {
panic(err.Error())
}
defer db.Close()
router := mux.NewRouter().StrictSlash(true)
router.HandleFunc("/tyranids", getTyranids).Methods("GET")
router.HandleFunc("/tyranids/{name}", getTyranidUnit).Methods("GET")
http.ListenAndServe(":8001", router)
}
Thank you.
You could just make a map to pair the tables to keys:
speciesUnitMap := map[string]string{
"Norn-Queen": "tyranids",
"Hive Tyrant": "tyranids",
"Rippers": "tyranids",
"Hive Ship": "tyranids",
"Ork Boyz": "orcs",
"Waaagh!": "orcs",
"Warboss": "orcs",
"Blood Axes": "orcs",
}
// This is quick and dirty: I would build the query string beforehand
// and verify that the received param maps to a known table
result, err := db.Query("SELECT * FROM " +
speciesUnitMap[params["name"]] + " WHERE name = ?", params["name"])
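Placeholders such as ? can only stand in for values, not for table names, so the table name has to be concatenated into the query and should be checked against a fixed list first. The following is a minimal sketch of that verification step; allowedTables and unitQuery are hypothetical names, and the table list is taken from the question:

```go
package main

import "fmt"

// allowedTables is a hypothetical allow-list of table names that may be
// concatenated into a query string.
var allowedTables = map[string]bool{
	"tyranids":   true,
	"greyknghts": true,
	"deathguard": true,
}

// unitQuery returns a parameterized SELECT for the given table, or an
// error if the table is not on the allow-list.
func unitQuery(table string) (string, error) {
	if !allowedTables[table] {
		return "", fmt.Errorf("unknown table %q", table)
	}
	return "SELECT * FROM " + table + " WHERE name = ?", nil
}

func main() {
	q, err := unitQuery("tyranids")
	fmt.Println(q, err)

	// Anything that is not an exact, known table name is rejected.
	_, err = unitQuery("tyranids; DROP TABLE tyranids")
	fmt.Println(err)
}
```

The returned string can then be passed to db.Query together with the unit name as the placeholder argument, so only the validated table name is ever concatenated.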
I'm working on a program that queries MySQL and then, for each row, changes something in that row and updates it.
The problem is that I sometimes get a deadlock when performing an update. I'm not sure whether that's because the query hasn't released its lock by the time I update, or because of something else.
Example of what I'm doing:
const (
selectQuery = `select user_id, original_transaction_id, max(payment_id) as max_payment_id from Payment_Receipt
where auto_renew_status = 1 group by user_id, original_transaction_id having count(*) > 1`
updateQuery = `update Payment_Receipt set auto_renew_status = 0, changed_by = "payment_receipt_condenser",
changed_time = ? where user_id = ? and original_transaction_id = ? and payment_id != ? and auto_renew_status = 1`
)
mysql.go:
func New(db *sql.DB, driver string) (database.Database, error) {
sqlDB := sqlx.NewDb(db, driver)
if err := db.Ping(); err != nil {
return nil, errors.Wrap(err, "connecting to database")
}
selectStmt, err := sqlDB.Preparex(selectQuery)
if err != nil {
return nil, errors.Wrap(err, "preparing select query")
}
updateStmt, err := sqlDB.Preparex(updateQuery)
if err != nil {
return nil, errors.Wrap(err, "preparing update query")
}
return &mysql{
db: sqlDB,
selectStmt: selectStmt,
updateStmt: updateStmt,
}, nil
}
func (m *mysql) Query() (<- chan *database.Row, error) {
rowsChan := make(chan *database.Row)
rows, err := m.selectStmt.Queryx()
if err != nil {
return nil, errors.Wrap(err, "making query")
}
go func() {
defer rows.Close()
defer close(rowsChan)
for rows.Next() {
row := &database.Row{}
if err := rows.StructScan(row); err != nil {
log.WithError(err).WithField("user_id", row.UserID.Int32).Error("scanning row")
}
// change some of the data here
// and put into channel for worker to consume
rowsChan <- row
}
}()
return rowsChan, nil
}
func (m *mysql) Update(row *database.Row) error {
tx, err := m.db.Beginx()
if err != nil {
return errors.Wrap(err, "beginning transaction")
}
if _, err := tx.Stmtx(m.updateStmt).Exec(row.ChangedTime); err != nil {
return errors.Wrap(err, "executing update")
}
if err := tx.Commit(); err != nil {
return errors.Wrap(err, "committing transaction")
}
return nil
}
worker.go
func (w *worker) Run(wg *sync.WaitGroup) {
rowsChan, err := w.db.Query()
if err != nil {
log.WithError(err).Fatal("failed making query")
}
for i := 0; i < w.config.Count(); i++ {
wg.Add(1)
go func() {
defer wg.Done()
for row := range rowsChan {
if err := w.db.Update(row); err != nil {
log.WithError(err).WithField("user_id", row.UserID.Int32).Error("updating row")
}
}
}()
}
}
You could make the result (row) channel returned by Query() buffered:
func (m *mysql) Query() (<- chan *database.Row, error) {
rowsChan := make(chan *database.Row, 1000) // <- band-aid fix
// ...
}
This ensures that the row-collecting goroutine can write multiple results without waiting for a worker goroutine to read them. The query can then complete (provided there are 1000 rows or fewer), and the update goroutines can begin their parallel work.
If this fixes things, consider putting an SQL limit on your queries (e.g. LIMIT 1000) so that you don't hit the deadlock again, if more than 1000 records is a real possibility.
You could then craft "pagination"-style queries that grab the next 1000 rows or so, using row ID markers etc. to ensure full coverage of the results, all while avoiding locking out any of your update operations.
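The pagination idea can be sketched without a database. In the example below, fetchPage is a hypothetical stand-in for a query of the form "WHERE id > ? ORDER BY id LIMIT ?" (the id column and all names here are assumptions, not part of the original code); each page is read completely before any update runs:

```go
package main

import "fmt"

// row is a stand-in for database.Row; only the ID matters here.
type row struct{ ID int }

// fetchPage is a hypothetical stand-in for a paginated SELECT. It returns
// up to limit rows whose ID is greater than afterID. The table slice is
// assumed to be sorted by ID, with all IDs positive.
func fetchPage(table []row, afterID, limit int) []row {
	var page []row
	for _, r := range table {
		if r.ID > afterID && len(page) < limit {
			page = append(page, r)
		}
	}
	return page
}

// processAll walks the table page by page and returns the number of pages.
// A page is fully collected before update is called on any of its rows,
// so no result set is still being read while updates run.
func processAll(table []row, limit int, update func(row)) int {
	afterID, pages := 0, 0
	for {
		page := fetchPage(table, afterID, limit)
		if len(page) == 0 {
			return pages
		}
		for _, r := range page {
			update(r)
		}
		afterID = page[len(page)-1].ID // marker for the next page
		pages++
	}
}

func main() {
	table := []row{{1}, {2}, {3}, {4}, {5}}
	updated := 0
	pages := processAll(table, 2, func(row) { updated++ })
	fmt.Println(pages, updated)
}
```

Using the last seen ID as the marker, rather than an OFFSET, keeps every row covered exactly once even if earlier rows change between pages.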
When using Go's database/sql package, if I run a query within a transaction and hit an error while calling rows.Scan(), which method should I call first: *sql.Tx.Rollback() or *sql.Rows.Close()? Currently I call *sql.Rows.Close() before *sql.Tx.Rollback(), but I want to know what happens if I reverse the order.
tx, err := db.Begin()
if err != nil {
... // handle error
}
rows, err := tx.Query("sqlstmt")
if err != nil {
... // handle error
}
defer rows.Close() // can I use defer here, even though it will run after tx.Rollback()?
if err := rows.Scan(vars...); err != nil {
if e := tx.Rollback(); e != nil {
log(e)
return e
}
return err
}
See https://go-review.googlesource.com/c/go/+/44812/ for the relevant code.
It doesn't matter, even if you skip rows.Close() within a transaction.
When the transaction is committed or rolled back, the rows are closed by the transaction's context.
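If you would still rather close the rows before the rollback, deferring both calls gives that order automatically, because deferred calls run last-in, first-out. The sketch below uses stand-in types (fakeTx and fakeRows are not database/sql types) purely to record the order of the calls:

```go
package main

import "fmt"

// fakeTx and fakeRows are stand-ins that only record when they are
// rolled back or closed; they do not talk to a database.
type fakeTx struct{ calls *[]string }
type fakeRows struct{ calls *[]string }

func (t fakeTx) Rollback() error {
	*t.calls = append(*t.calls, "tx.Rollback")
	return nil
}

func (r fakeRows) Close() error {
	*r.calls = append(*r.calls, "rows.Close")
	return nil
}

// run mirrors the shape of the question's code: the transaction is opened
// first, the rows second, and both cleanups are deferred.
func run() []string {
	var calls []string
	func() {
		tx := fakeTx{&calls}
		defer tx.Rollback() // deferred first, so it runs last
		rows := fakeRows{&calls}
		defer rows.Close() // deferred last, so it runs first
		// A Scan error would simply return from here.
	}()
	return calls
}

func main() {
	fmt.Println(run()) // prints [rows.Close tx.Rollback]
}
```

With the real database/sql types, a deferred Rollback that runs after a successful Commit returns sql.ErrTxDone, which is commonly ignored.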
The Go code below reads in a 10,000 record CSV (of timestamp times and float values), runs some operations on the data, and then writes the original values to another CSV along with an additional column for score. However it is terribly slow (i.e. hours, but most of that is calculateStuff()) and I'm curious if there are any inefficiencies in the CSV reading/writing I can take care of.
package main
import (
"encoding/csv"
"log"
"os"
"strconv"
)
func ReadCSV(filepath string) ([][]string, error) {
csvfile, err := os.Open(filepath)
if err != nil {
return nil, err
}
defer csvfile.Close()
reader := csv.NewReader(csvfile)
fields, err := reader.ReadAll()
if err != nil {
return nil, err
}
return fields, nil
}
func main() {
// load data csv
records, err := ReadCSV("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
// write results to a new csv
outfile, err := os.Create("./where/to/write/resultsfile.csv")
if err != nil {
log.Fatal("Unable to open output")
}
defer outfile.Close()
writer := csv.NewWriter(outfile)
for i, record := range records {
time := record[0]
value := record[1]
// skip header row
if i == 0 {
writer.Write([]string{time, value, "score"})
continue
}
// get float values
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatalf("Record: %v, Error: %v", value, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
valueString := strconv.FormatFloat(floatValue, 'f', 8, 64)
scoreString := strconv.FormatFloat(score, 'f', 8, 64)
//fmt.Printf("Result: %v\n", []string{time, valueString, scoreString})
writer.Write([]string{time, valueString, scoreString})
}
writer.Flush()
}
I'm looking for help making this CSV read/write template code as fast as possible. For the scope of this question we need not worry about the calculateStuff method.
You're loading the whole file into memory first and then processing it, which can be slow with a big file.
Instead, loop, calling .Read and processing one record at a time.
func processCSV(rc io.Reader) (ch chan []string) {
ch = make(chan []string, 10)
go func() {
r := csv.NewReader(rc)
if _, err := r.Read(); err != nil { //read header
log.Fatal(err)
}
defer close(ch)
for {
rec, err := r.Read()
if err != nil {
if err == io.EOF {
break
}
log.Fatal(err)
}
ch <- rec
}
}()
return
}
Note that this is roughly based on Dave C's comment.
This is essentially Dave C's answer from the comments section:
package main
import (
"encoding/csv"
"io"
"log"
"os"
"strconv"
)
func main() {
// setup reader
csvIn, err := os.Open("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
r := csv.NewReader(csvIn)
// setup writer
csvOut, err := os.Create("./where/to/write/resultsfile.csv")
if err != nil {
log.Fatal("Unable to open output")
}
w := csv.NewWriter(csvOut)
defer csvOut.Close()
// handle header
rec, err := r.Read()
if err != nil {
log.Fatal(err)
}
rec = append(rec, "score")
if err = w.Write(rec); err != nil {
log.Fatal(err)
}
for {
rec, err = r.Read()
if err != nil {
if err == io.EOF {
break
}
log.Fatal(err)
}
// get float value
value := rec[1]
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatalf("Record, error: %v, %v", value, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
scoreString := strconv.FormatFloat(score, 'f', 8, 64)
rec = append(rec, scoreString)
if err = w.Write(rec); err != nil {
log.Fatal(err)
}
}
// flush once, after all records have been written
w.Flush()
}
Note that the logic is all jammed into main(); it would be better to split it into several functions, but that's beyond the scope of this question.
encoding/csv is indeed very slow on big files, as it performs a lot of allocations. Since your format is so simple I recommend using strings.Split instead which is much faster.
If even that is not fast enough you can consider implementing the parsing yourself using strings.IndexByte which is implemented in assembly: http://golang.org/src/strings/strings_decl.go?s=274:310#L1
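For the two-column format in the question, hand-rolled parsing with strings.IndexByte could look roughly like the sketch below. It cuts each line at the first comma and returns substrings of the original line, so no slice is allocated per record; splitTwo is a hypothetical name and the same no-quoting assumption applies:

```go
package main

import (
	"fmt"
	"strings"
)

// splitTwo cuts a "time,value" line at the first comma. ok is false if
// the line contains no comma. Assumption: fields are never quoted.
func splitTwo(line string) (timeField, value string, ok bool) {
	i := strings.IndexByte(line, ',')
	if i < 0 {
		return "", "", false
	}
	return line[:i], line[i+1:], true
}

func main() {
	t, v, ok := splitTwo("2015-01-01,1.5")
	fmt.Println(t, v, ok)
}
```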
Having said that, you should also reconsider using ReadAll if the file is larger than your memory.