I'm new in Go (Golang). I wrote a simple benchmark program to test the concurrent processing with MySQL. Keep getting "dial tcp 52.55.254.165:3306: getsockopt: connection refused", "unexpected EOF" errors when I increase the number of concurrent channels.
Each go routine is doing a batch insert of 1 to n number of row to a simple customer table. The program allows to set variable insert size (number of rows in a single statement) and number of parallel go routine (each go routine performs one insert above). Program works fine with small numbers row<100 and number go routines<100. But start getting Unexpected EOF errors when the numbers increase, especially the number of parallel go routines.
Did search for clues. Based on them, I've set the database max connection and 'max_allowed_packet' and 'max_connections'. I've also set the go program db.db.SetMaxOpenConns(200), db.SetConnMaxLifetime(200), db.SetMaxIdleConns(10). I've experimented with big numbers and small (from 10 to 2000). Nothing seems to solve the program.
I have one global db connection open. Code snippet below:
// main package
func main() {
var err error
db, err = sql.Open("mysql","usr:pwd#tcp(ip:3306)/gopoc")
if err != nil {
log.Panic(err)
}
db.SetMaxOpenConns(1000)
db.SetConnMaxLifetime(1000)
db.SetMaxIdleConns(10)
// sql.DB should be long lived "defer" closes it once this function ends
defer db.Close()
if err = db.Ping(); err != nil {
log.Panic(err)
}
http.HandleFunc("/addCust/", HFHandleFunc(addCustHandler))
http.ListenAndServe(":8080", nil)
}
// add customer handler
func addCustHandler(w http.ResponseWriter, r *http.Request) {
// experected url: /addCust/?num=3$pcnt=1
num, _ := strconv.Atoi(r.URL.Query().Get("num"))
pcnt, _ := strconv.Atoi(r.URL.Query().Get("pcnt"))
ch := make([]chan string, pcnt) // initialize channel slice
for i := range ch {
ch[i] = make(chan string, 1)
}
var wg sync.WaitGroup
for i, chans := range ch {
wg.Add(1)
go func(cha chan string, ii int) {
defer wg.Done()
addCust(num)
cha <- "Channel[" + strconv.Itoa(ii) + "]\n"
}(chans, i)
}
wg.Wait()
var outputstring string
for i := 0; i < pcnt; i++ {
outputstring = outputstring + <-ch[i]
}
fmt.Fprintf(w, "Output:\n%s", outputstring)
}
func addCust(cnt int) sql.Result {
...
sqlStr := "INSERT INTO CUST (idCUST, idFI, fName, state, country) VALUES "
for i := 0; i < cnt; i++ {
sqlStr += "(" + strconv.Itoa(FiIDpadding+r.Intn(CidMax)+1) + ", " + strconv.Itoa(FiID) +", 'fname', 'PA', 'USA), "
}
//trim the last ,
sqlStr = sqlStr[0:len(sqlStr)-2] + " on duplicate key update lname='dup';"
res, err := db.Exec(sqlStr)
if err != nil {
panic("\nInsert Statement error\n" + err.Error())
}
return res
}
I suppose you are calling sql.Open in each of your routine?
The Open function should be called just once. You should share your opened DB connection between your routines. The DB returned by the Open function can be used concurrently and has its own pool
Related
I have some code which inserts the records on the database:
The code is supposed to insert 15M records on the database, right now, it takes 60 hours on a AWS t2.large instance. I'm looking for ways to make the insert on the DB faster while also not duplicating records.
Do you guys have suggestions for me?
I'm using Gorm and MYSQL.
// InsertJob will insert job into database, by checking its hash.
func InsertJob(job XMLJob, oid int, ResourceID int) (Job, error) {
db := globalDBConnection
cleanJobDescription := job.Body
hashString := GetMD5Hash(job.Title + job.Body + job.Location + job.Zip)
JobDescriptionHash := GetMD5Hash(job.Body)
empty := sql.NullString{String: "", Valid: true}
j := Job{
CurrencyID: 1, //USD
//other fields here elided for brevity
PrimaryIndustry: sql.NullString{String: job.PrimaryIndustry, Valid: true},
}
err := db.Where("hash = ?", hashString).Find(&j).Error
if err != nil {
if err.Error() != "record not found" {
return j, err
}
err2 := db.Create(&j).Error
if err2 != nil {
log.Println("Unable to create job:" + err.Error())
return j, err2
}
}
return j, nil
}
You can speed it up using using semaphore pattern.
https://play.golang.org/p/OxO8pNy3bc6
inspired from here.
https://gist.github.com/montanaflynn/ea4b92ed640f790c4b9cee36046a5383
I'm doing something like this:
import(
"database/sql"
"github.com/go-sql-driver/mysql"
)
var db *sql.DB
func main() {
var err error
db, err = sql.Open(...)
if err != nil {
panic(err)
}
for j := 0; j < 8000; j++ {
_, err := db.Query("QUERY...")
if err != nil {
logger.Println("Error " + err.Error())
return
}
}
}
It works for the first 150 queries (for that I'm using another function to make) but after that, I get the error :
mysqli_real_connect(): (HY000/1040): Too many connections
So clearly I'm doing something wrong but I can't find what is it. I don't know what to open and close a new connection for each query.
Error in the log file :
"reg: 2020/06/28 03:35:34 Errores Error 1040: Too many connections"
(it is printed only once)
Error in mysql php my admin:
"mysqli_real_connect(): (HY000/1040): Too many connections"
"La conexión para controluser, como está definida en su configuración, fracasó."
(translated: "the connection for controluser, as it is defined in ti's configuration , failed.")
"mysqli_real_connect(): (08004/1040): Too many connections"
Every time you call Query(), you're creating a new database handle. Each active handle needs a unique database connection. Since you're not calling Close, that handle, and thus the connection, remains open until the program exits.
Solve your problem by calling rows.Close() after you're done with each query:
for j := 0; j < 8000; j++ {
rows, err := db.Query("QUERY...")
if err != nil {
logger.Println("Error " + err.Error())
return
}
// Your main logic here
rows.Close()
}
This Close() call is often called in a defer statement, but this precludes the use of a for loop (since a defer only executes when then function returns), so you may want to move your main logic to a new function:
for j := 0; j < 8000; j++ {
doStuff()
}
// later
func doStuff() {
rows, err := db.Query("QUERY...")
if err != nil {
logger.Println("Error " + err.Error())
return
}
defer rows.Close()
// Your main logic here
}
I'm trying to create a very simple Bolt database called "ledger.db" that includes one Bucket, called "Users", which contains Usernames as a Key and Balances as the value that allows users to transfer their balance to one another. I am using Bolter to view the database in the command line
There are two problems, both contained in this transfer function issue resides in the transfer function.
The First: Inside the transfer function is an if/else. If the condition is true, it executes as it should. If it's false, nothing happens. There's no syntax errors and the program runs as though nothing is wrong, it just doesn't execute the else statement.
The Second: Even if the condition is true, when it executes, it doesn't update BOTH the respective balance values in the database. It updates the balance of the receiver, but it doesn't do the same for the sender. The mathematical operations are completed and the values are marshaled into a JSON-compatible format.
The problem is that the sender balance is not updated in the database.
Everything from the second "Success!" fmt.Println() function onward is not processed
I've tried changing the "db.Update()" to "db.Batch()". I've tried changing the order of the Put() functions. I've tried messing with goroutines and defer, but I have no clue how to use those, as I am rather new to golang.
func (from *User) transfer(to User, amount int) error{
var fbalance int = 0
var tbalance int = 0
db, err := bolt.Open("ledger.db", 0600, nil)
if err != nil {
log.Fatal(err)
}
defer db.Close()
return db.Update(func(tx *bolt.Tx) error {
uBuck := tx.Bucket([]byte("Users"))
json.Unmarshal(uBuck.Get([]byte(from.username)), &fbalance)
json.Unmarshal(uBuck.Get([]byte(to.username)), &tbalance)
if (amount <= fbalance) {
fbalance = fbalance - amount
encoded, err := json.Marshal(fbalance)
if err != nil {
return err
}
tbalance = tbalance + amount
encoded2, err := json.Marshal(tbalance)
if err != nil {
return err
}
fmt.Println("Success!")
c := uBuck
err = c.Put([]byte(to.username), encoded2)
return err
fmt.Println("Success!")
err = c.Put([]byte(from.username), encoded)
return err
fmt.Println("Success!")
} else {
return fmt.Errorf("Not enough in balance!", amount)
}
return nil
})
return nil
}
func main() {
/*
db, err := bolt.Open("ledger.db", 0600, nil)
if err != nil {
log.Fatal(err)
}
defer db.Close()
*/
var b User = User{"Big", "jig", 50000, 0}
var t User = User{"Trig", "pig", 40000, 0}
// These two functions add each User to the database, they aren't
// the problem
b.createUser()
t.createUser()
/*
db.View(func(tx *bolt.Tx) error {
c := tx.Bucket([]byte("Users"))
get := c.Get([]byte(b.username))
fmt.Printf("The return value %v",get)
return nil
})
*/
t.transfer(b, 40000)
}
I expect the database to show Big:90000 Trig:0 from the beginning values of Big:50000 Trig:40000
Instead, the program outputs Big:90000 Trig:40000
You return unconditionally:
c := uBuck
err = c.Put([]byte(to.username), encoded2)
return err
fmt.Println("Success!")
err = c.Put([]byte(from.username), encoded)
return err
fmt.Println("Success!")
You are not returning and checking errors.
json.Unmarshal(uBuck.Get([]byte(from.username)), &fbalance)
json.Unmarshal(uBuck.Get([]byte(to.username)), &tbalance)
t.transfer(b, 40000)
And so on.
Debug your code statement by statement.
I want to make dynamic sql in GoLang and I cant seem to find the correct way to do it.
Basically, I just want to do:
query := "SELECT id, email, something FROM User"
var paramValues []string
filterString := ""
if userParams.Name != "" {
paramString += " WHERE id = ?"
paramValues = append(paramValues, userParams.Name)
}
if userParams.UserID != "" {
if len(paramString) > 0 {
paramString += " AND"
} else {
paramString += " WHERE"
}
paramString += " email = ?"
paramValues = append(paramValues, userParams.UserID)
}
stmtOut, err := db.Prepare(query + paramString)
err = stmtOut.QueryRow(paramValues).Scan(&id, &email, &something)
Related to building a dynamic query in mysql and golang
I've been unable to find a solid way to do this that doesn't allow sql injection. The issue with my above solution is that QueryRow() does not take a []string as a parameter.
I want to protect from SQL Injection, so fmt.Sprintf doesn't really solve the problem.
This way I can allow searches on user using either the ID or Email, and I will also use this logic for different objects with more searchable fields.
I'm using go-sql-driver/mysql
Here's something which I can run on my local machine (go1.8 linux/amd64 and current GO MySQL driver 1.3).
Couple of ways are demonstrated.
package main
import (
"database/sql"
"log"
_ "github.com/go-sql-driver/mysql"
"fmt"
)
// var db *sql.DB
// var err error
/*
Database Name/Schema : Test123
Table Name: test
Table Columns and types:
number INT (PRIMARY KEY)
cube INT
*/
func main() {
//Username root, password root
db, err := sql.Open("mysql", "root:root#tcp(127.0.0.1:3306)/Test123?charset=utf8")
if err != nil {
fmt.Println(err) // needs proper handling as per app requirement
return
}
defer db.Close()
err = db.Ping()
if err != nil {
fmt.Println(err) // needs proper handling as per app requirement
return
}
//Prepared statement for inserting data
stmtIns, err := db.Prepare("INSERT INTO test VALUES( ?, ? )") // ? = placeholders
if err != nil {
panic(err.Error()) // needs proper handling as per app requirement
}
defer stmtIns.Close()
//Insert cubes of 1- 10 numbers
for i := 1; i < 10; i++ {
_, err = stmtIns.Exec(i, (i * i * i)) // Insert tuples (i, i^3)
if err != nil {
panic(err.Error()) // proper error handling instead of panic in your app
}
}
num := 3
// Select statement
dataEntity := "cube"
condition := "WHERE number=? AND cube > ?"
finalStatement := "SELECT " + dataEntity + " FROM test " + condition
cubeLowerLimit := 10
var myCube int
err = db.QueryRow(finalStatement, num, cubeLowerLimit).Scan(&myCube)
switch {
case err == sql.ErrNoRows:
log.Printf("No row with this number %d", num)
case err != nil:
log.Fatal(err)
default:
fmt.Printf("Cube for %d is %d\n", num, myCube)
}
var cubenum int
// //Prepared statement for reading data
stmtRead, err := db.Prepare(finalStatement)
if err != nil {
panic(err.Error()) // needs proper err handling
}
defer stmtRead.Close()
// Query for cube of 5
num = 5
err = stmtRead.QueryRow(num, cubeLowerLimit).Scan(&cubenum)
switch {
case err == sql.ErrNoRows:
log.Printf("No row with this number %d", num)
case err != nil:
log.Fatal(err)
default:
fmt.Printf("Cube number for %d is %d\n", num, cubenum)
}
}
If you run it subsequent times, you need to delete the rows in the database so that the inserts won't create a panic (or alternatively change the insert rows code so that it doesn't panic). I haven't tried it on Google App Engine. Hope this helps.
I wrote a script to migrate lots of data from one DB to another and got it working fine, but now I want to try and use goroutines to speed up the script by using concurrent DB calls. Since making the change to calling go processBatch(offset) instead of just processBatch(offset), I can see that a few goroutines are started but the script finishes almost instantly and nothing is actually done. Also the number of started goroutines varies every time I call the script. There are no errors (that I can see).
I'm still new to goroutines and Go in general, so any pointers as to what I might be doing wrong are much appreciated. I have removed all logic from the code below that is not related to concurrency or DB access, as it runs fine without the changes. I also left a comment where I believe it fails, as nothing below that line is run (Print gives not output). I also tried using sync.WaitGroup to stagger DB calls, but it didn't seem to change anything.
var (
legacyDB *sql.DB
v2DB *sql.DB
)
func main() {
var total, loops int
var err error
legacyDB, err = sql.Open("mysql", "...")
if err != nil {
panic(err)
}
defer legacyDB.Close()
v2DB, err = sql.Open("mysql", "...")
if err != nil {
panic(err)
}
defer v2DB.Close()
err = legacyDB.QueryRow("SELECT count(*) FROM users").Scan(&total)
checkErr(err)
loops = int(math.Ceil(float64(total) / float64(batchsize)))
fmt.Println("Total: " + strconv.Itoa(total))
fmt.Println("Loops: " + strconv.Itoa(loops))
for i := 0; i < loops; i++ {
offset := i * batchsize
go processBatch(offset)
}
legacyDB.Close()
v2DB.Close()
}
func processBatch(offset int) {
query := namedParameterQuery.NewNamedParameterQuery(`
SELECT ...
LIMIT :offset,:batchsize
`)
query.SetValue(...)
rows, err := legacyDB.Query(query.GetParsedQuery(), (query.GetParsedParameters())...)
// nothing after this line gets done (Println here does not show output)
checkErr(err)
defer rows.Close()
....
var m runtime.MemStats
runtime.ReadMemStats(&m)
log.Printf("\nAlloc = %v\nTotalAlloc = %v\nSys = %v\nNumGC = %v\n\n", m.Alloc/1024/1024, m.TotalAlloc/1024/1024, m.Sys/1024/1024, m.NumGC)
}
func checkErr(err error) {
if err != nil {
panic(err)
}
}
As Nadh mentioned in a comment, that would be because the program exits when the main function finishes, regardless whether or not there are still other goroutines running. To fix this, a *sync.WaitGroup will suffice. A WaitGroup is used for cases where you have multiple concurrent operations, and you would like to wait until they have all completed. Documentation can be found here: https://golang.org/pkg/sync/#WaitGroup.
An example implementation for your program without the use of global variables would look like replacing
fmt.Println("Total: " + strconv.Itoa(total))
fmt.Println("Loops: " + strconv.Itoa(loops))
for i := 0; i < loops; i++ {
offset := i * batchsize
go processBatch(offset)
}
with
fmt.Println("Total: " + strconv.Itoa(total))
fmt.Println("Loops: " + strconv.Itoa(loops))
wg := new(sync.WaitGroup)
wg.Add(loops)
for i := 0; i < loops; i++ {
offset := i * batchsize
go func(offset int) {
defer wg.Done()
processBatch(offset)
}(offset)
}
wg.Wait()