The problem
I wrote an application which synchronizes data from BigQuery into a MySQL database. I try to insert roughly 10-20k rows in batches (up to 10 items each batch) every 3 hours. For some reason I receive the following error when it tries to upsert these rows into MySQL:
Error 1461: Can't create more than max_prepared_stmt_count statements (current value: 2000)
My "relevant code"
// ProcessProjectSkuCost receives the given sku cost entries and sends them in batches to upsertProjectSkuCosts()
func ProcessProjectSkuCost(done <-chan bigquery.SkuCost) {
    var skuCosts []bigquery.SkuCost
    var rowsAffected int64
    for skuCostRow := range done {
        skuCosts = append(skuCosts, skuCostRow)
        if len(skuCosts) == 10 {
            rowsAffected += upsertProjectSkuCosts(skuCosts)
            skuCosts = []bigquery.SkuCost{}
        }
    }
    if len(skuCosts) > 0 {
        rowsAffected += upsertProjectSkuCosts(skuCosts)
    }
    log.Infof("Completed upserting project sku costs. Affected rows: '%d'", rowsAffected)
}
// upsertProjectSkuCosts inserts or updates ProjectSkuCosts into SQL in batches
func upsertProjectSkuCosts(skuCosts []bigquery.SkuCost) int64 {
    // properties are table fields
    tableFields := []string{"project_name", "sku_id", "sku_description", "usage_start_time", "usage_end_time",
        "cost", "currency", "usage_amount", "usage_unit", "usage_amount_in_pricing_units", "usage_pricing_unit",
        "invoice_month"}
    tableFieldString := fmt.Sprintf("(%s)", strings.Join(tableFields, ","))

    // placeholder string for all values to be inserted
    placeholderString := createPlaceholderString(tableFields)
    valuePlaceholderString := ""
    values := []interface{}{}
    for _, row := range skuCosts {
        valuePlaceholderString += fmt.Sprintf("(%s),", placeholderString)
        values = append(values, row.ProjectName, row.SkuID, row.SkuDescription, row.UsageStartTime,
            row.UsageEndTime, row.Cost, row.Currency, row.UsageAmount, row.UsageUnit,
            row.UsageAmountInPricingUnits, row.UsagePricingUnit, row.InvoiceMonth)
    }
    valuePlaceholderString = strings.TrimSuffix(valuePlaceholderString, ",")

    // put together SQL string
    sqlString := fmt.Sprintf(`INSERT INTO
        project_sku_cost %s VALUES %s ON DUPLICATE KEY UPDATE invoice_month=invoice_month`, tableFieldString, valuePlaceholderString)
    sqlString = strings.TrimSpace(sqlString)

    stmt, err := db.Prepare(sqlString)
    if err != nil {
        log.Warn("Error while preparing SQL statement to upsert project sku costs. ", err)
        return 0
    }

    // execute query
    res, err := stmt.Exec(values...)
    if err != nil {
        log.Warn("Error while executing statement to upsert project sku costs. ", err)
        return 0
    }

    rowsAffected, err := res.RowsAffected()
    if err != nil {
        log.Warn("Error while trying to access affected rows ", err)
        return 0
    }
    return rowsAffected
}
// createPlaceholderString creates the placeholder part of the prepared statement (output looks like "?,?,?")
func createPlaceholderString(tableFields []string) string {
    placeHolderString := ""
    for range tableFields {
        placeHolderString += "?,"
    }
    placeHolderString = strings.TrimSuffix(placeHolderString, ",")
    return placeHolderString
}
My question:
Why do I hit max_prepared_stmt_count when I immediately execute the prepared statement (see the function upsertProjectSkuCosts)?
I can only imagine it's some sort of concurrency creating tons of prepared statements between preparing and executing all these statements. On the other hand, I don't understand where so much concurrency would come from, as the channel in ProcessProjectSkuCost is a buffered channel with a size of 20.
You need to close the statement inside upsertProjectSkuCosts() (or re-use it - see the end of this post).
When you call db.Prepare(), a connection is taken from the internal connection pool (or a new connection is created, if there aren't any free connections). The statement is then prepared on that connection (if that connection isn't free when stmt.Exec() is called, the statement is then also prepared on another connection).
So this creates a statement inside your database for that connection. This statement will not magically disappear; having multiple prepared statements on a connection is perfectly valid. Go could notice that stmt goes out of scope and requires some sort of cleanup, but it doesn't do that for you (just as it doesn't close files for you). So you'll need to do that yourself using stmt.Close(). When you call stmt.Close(), the driver sends a command to the database server, telling it the statement is no longer needed.
The easiest way to do this is by adding defer stmt.Close() after the err check following db.Prepare().
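Applied to the code above, that would look like this (only the prepare/exec part shown):

stmt, err := db.Prepare(sqlString)
if err != nil {
    log.Warn("Error while preparing SQL statement to upsert project sku costs. ", err)
    return 0
}
// Deallocate the statement on the MySQL server when this function
// returns, so each call no longer leaks one prepared statement.
defer stmt.Close()

res, err := stmt.Exec(values...)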
What you can also do is prepare the statement once and make it available to upsertProjectSkuCosts (either by passing the stmt into upsertProjectSkuCosts or by making upsertProjectSkuCosts a method on a struct, so the struct can hold the stmt). If you do this, you should not call stmt.Close() - because you aren't creating new statements anymore, you are re-using an existing statement.
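A minimal sketch of the re-use approach (the struct and constructor names are illustrative, not from the question; note that the placeholder count is baked into a prepared statement, so this only works when every batch has the same fixed size):

type skuCostUpserter struct {
    stmt *sql.Stmt // prepared once, re-used for every batch
}

func newSkuCostUpserter(db *sql.DB, sqlString string) (*skuCostUpserter, error) {
    stmt, err := db.Prepare(sqlString) // prepared once, e.g. at startup
    if err != nil {
        return nil, err
    }
    return &skuCostUpserter{stmt: stmt}, nil
}

func (u *skuCostUpserter) upsert(values ...interface{}) (int64, error) {
    // Re-uses the single server-side statement; no new statement is created.
    res, err := u.stmt.Exec(values...)
    if err != nil {
        return 0, err
    }
    return res.RowsAffected()
}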
Also see Should we also close DB's .Prepare() in Golang? and https://groups.google.com/forum/#!topic/golang-nuts/ISh22XXze-s
Related
I have this MySQL database where I need to add records with a go program and need to retrieve the id of the last added record, to add the id to another table.
When I run INSERT INTO table1 values("test",1); SELECT LAST_INSERT_ID(); in MySQL Workbench, it returns the last id, which is auto-incremented, with no issues.
If I run my go code however, it always prints 0. The code:
_, err := db_client.DBClient.Query("insert into table1 values(?,?)", name, 1)
var id string
err = db_client.DBClient.QueryRow("SELECT LAST_INSERT_ID()").Scan(&id)
if err != nil {
    panic(err.Error())
}
fmt.Println("id: ", id)
I tried this variation to try to narrow down the problem scope further: err = db_client.DBClient.QueryRow("SELECT id from table1 where name=\"pleasejustwork\";").Scan(&id), which works perfectly fine; go returns the actual id.
Why is it not working with LAST_INSERT_ID()?
I'm a newbie in Go, so please don't go hard on me if I'm making stupid Go mistakes that lead to this error :D
Thank you in advance.
The MySQL protocol returns LAST_INSERT_ID() values in its response to INSERT statements. And, the golang driver exposes that returned value. So, you don't need the extra round trip to get it. These ID values are usually unsigned 64-bit integers.
Try something like this.
res, err := db_client.DBClient.Exec("insert into table1 values(?,?)", name, 1)
if err != nil {
    panic(err.Error())
}
id, err := res.LastInsertId()
if err != nil {
    panic(err.Error())
}
fmt.Println("id: ", id)
I confess I'm not sure why your code didn't work. Whenever you successfully issue a single-row INSERT statement, the next statement on the same database connection always has access to a useful LAST_INSERT_ID() value, whether or not you use explicit transactions. (My best guess: database/sql manages a pool of connections, and LAST_INSERT_ID() is tracked per connection, so your separate QueryRow may well have run on a different connection than the INSERT.)
But if your INSERT is not successful, you must treat the last insert ID value as unpredictable. (That's a technical term for "garbage", trash, rubbish, basura, etc.)
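If you do want the separate SELECT LAST_INSERT_ID() round trip, one way to make it reliable is to pin both statements to the same pooled connection. A sketch using database/sql's db.Conn (Go 1.9+); the variable names are mine:

ctx := context.Background()
conn, err := db_client.DBClient.Conn(ctx) // reserve a single connection from the pool
if err != nil {
    panic(err.Error())
}
defer conn.Close() // returns the connection to the pool

_, err = conn.ExecContext(ctx, "insert into table1 values(?,?)", name, 1)
if err != nil {
    panic(err.Error())
}

var id string
// Runs on the same connection as the INSERT, so LAST_INSERT_ID() is defined.
err = conn.QueryRowContext(ctx, "SELECT LAST_INSERT_ID()").Scan(&id)
if err != nil {
    panic(err.Error())
}
fmt.Println("id: ", id)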
I am developing a system to enable patient registration with incremental queue number. I am using Go, GORM, and MySQL.
An issue happens when more than one patient registers at the same time: they tend to get the same queue number, which should not happen.
I attempted using transactions and hooks to achieve this, but I still got duplicate queue numbers. I have not found any resource on how to lock the database while a transaction is happening.
func (r repository) CreatePatient(pat *model.Patient) error {
    tx := r.db.Begin()
    defer func() {
        if r := recover(); r != nil {
            tx.Rollback()
        }
    }()
    err := tx.Error
    if err != nil {
        return err
    }

    // 1. get latest queue number and assign it to patient object
    var queueNum int64
    err = tx.Model(&model.Patient{}).Where("registration_id", pat.RegistrationID).Select("queue_number").Order("created_at desc").First(&queueNum).Error
    if err != nil && err != gorm.ErrRecordNotFound {
        tx.Rollback()
        return err
    }
    pat.QueueNumber = queueNum + 1

    // 2. write patient data into the db
    err = tx.Create(pat).Error
    if err != nil {
        tx.Rollback()
        return err
    }
    return tx.Commit().Error
}
As stated by @O. Jones, transactions don't save you here because you're extracting the largest value of a column, incrementing it outside the db and then saving that new value. From the database's point of view the updated value has no dependence on the queried value.
You could try doing the update in a single query, which would make the dependence obvious:
UPDATE patient AS p
JOIN (
SELECT max(queue_number) AS queue_number FROM patient WHERE registration_id = ?
) maxp
SET p.queue_number = maxp.queue_number + 1
WHERE id = ?
In gorm you can't run a complex update like this, so you'll need to make use of Exec.
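A sketch of that with GORM's Exec, reusing the transaction tx from the question and assuming pat.ID holds the patient's primary key:

err = tx.Exec(`UPDATE patient AS p
    JOIN (
        SELECT max(queue_number) AS queue_number FROM patient WHERE registration_id = ?
    ) maxp
    SET p.queue_number = maxp.queue_number + 1
    WHERE p.id = ?`, pat.RegistrationID, pat.ID).Error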
I'm not 100% certain the above will work because I'm less familiar with MySQL transaction isolation guarantees.
A cleaner way
Overall, it'd be cleaner to keep a table of queues (by reference_id) with a counter that you update atomically:
1. Start a transaction.
2. SELECT queue_number FROM queues WHERE registration_id = ? FOR UPDATE;
3. Increment the queue number in your app code.
4. UPDATE queues SET queue_number = ? WHERE registration_id = ?;
5. Use the incremented queue number in your patient creation/update, then commit (a sketch follows below).
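A sketch of those steps, assuming a queues table keyed by registration_id (with a row already present for each registration_id) and GORM's Transaction helper:

err := r.db.Transaction(func(tx *gorm.DB) error {
    var queueNum int64
    // FOR UPDATE locks the counter row until the transaction ends,
    // so concurrent registrations are serialized here.
    if err := tx.Raw(
        "SELECT queue_number FROM queues WHERE registration_id = ? FOR UPDATE",
        pat.RegistrationID,
    ).Scan(&queueNum).Error; err != nil {
        return err
    }
    pat.QueueNumber = queueNum + 1 // increment in app code
    if err := tx.Exec(
        "UPDATE queues SET queue_number = ? WHERE registration_id = ?",
        pat.QueueNumber, pat.RegistrationID,
    ).Error; err != nil {
        return err
    }
    // create the patient with the reserved number, inside the same transaction
    return tx.Create(pat).Error
})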
I am trying to do a simple thing: check whether a table exists, and if not, create it in the database.
Here is the logic I used:
test := "June_2019"
sql_query := `select * from ` + test + `;`
read_err := db.QueryRow(sql_query, 5)
error_returned := read_err.Scan(read_err)
defer db.Close()
if error_returned == nil {
fmt.Println("table is there")
} else {
fmt.Println("table not there")
}
In my database I have a June_2019 table, but this code still returns a non-nil value. I passed 5 to db.QueryRow(sql_query, 5) as I have five columns in my table.
What am I missing here? I am still learning golang.
Thanks in advance.
I have solved the problem using golang and MySQL.
rows, table_check := db.Query("select * from " + table + ";")
if table_check == nil {
    rows.Close() // close the result set to release the connection
    fmt.Println("table is there")
} else {
    fmt.Println("table not there")
}
I used db.Query(), which returns the rows and an error; here I checked only the error.
I think most people thought I wanted to do it the MySQL way; I just wanted to learn how to use golang to do MySQL operations.
I'm using the github.com/go-sql-driver/mysql and mysql 5.7.10. I have a function:
bulkSetStatus := func(docVers []*_documentVersion) error {
    if len(docVers) > 0 {
        query := strings.Repeat("CALL documentVersionSetStatus(?, ?); ", len(docVers))
        args := make([]interface{}, 0, len(docVers)*2)
        for _, docVer := range docVers {
            args = append(args, docVer.Id, docVer.Status)
        }
        _, err := db.Exec(query, args...)
        return err
    }
    return nil
}
which works if len(docVers) == 1 but when there are more, resulting in multiple CALLs to the stored procedure, it errors:
Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CALL documentVersionSetStatus(?, ?)' at line 1
I have also tried a newline character between each call, but I get the same error. If I run this in MySQL Workbench with multiple CALLs to this procedure it works fine, so I'm not sure what is wrong with the syntax here.
I have logged out the exact full text with the arguments and it is as expected:
CALL documentVersionSetStatus("9c71cac14a134e7abbc4725997d90d2b", "inprogress"); CALL documentVersionSetStatus("beb65318da96406fa92990426a279efa", "inprogress");
go-sql-driver, by default, does not allow you to have multiple statements in one query (as you are doing by chaining together multiple CALL statements like that) due to the security implications if an attacker manages to perform SQL injection (for example, by injecting 0 OR 0; DROP TABLE foo).
To allow this, you must explicitly enable it by passing multiStatements parameter when connecting to the database, e.g.
db, err := sql.Open("mysql", "user:password@/dbname?multiStatements=true")
Source: https://github.com/go-sql-driver/mysql#multistatements
I have fixed the proc call by doing some manual string interpolation for the parameters instead of using the correct ? way of doing it:
bulkSetStatus := func(docVers []*_documentVersion) error {
    if len(docVers) > 0 {
        query := strings.Repeat("CALL documentVersionSetStatus(%q, %q); ", len(docVers))
        args := make([]interface{}, 0, len(docVers)*2)
        for _, docVer := range docVers {
            args = append(args, docVer.Id, docVer.Status)
        }
        _, err := db.Exec(fmt.Sprintf(query, args...))
        return err
    }
    return nil
}
So I swap out the ? for %q and use fmt.Sprintf to inject the parameters. I should note that slugonamission's answer is partially correct: I did need to add the connection string parameter multiStatements=true to get this to work along with my other changes. I will log an issue on the GitHub repo; it looks like there may be a parameter-interpolation issue when there is more than one statement. I think the error was happening because the MySQL db was trying to run the script with ? literals in it.
So I have been rewriting an old PHP system in Go, looking for some performance gains, but I'm not getting any. The problem seems to be in the INSERTs I'm doing into MySQL.
Where PHP does some processing of a CSV file, does some hashing and inserts around 10k rows into MySQL, it takes 40 seconds (unoptimized code).
Go, on the other hand, stripped of any processing and doing just the same insert of 10k (empty) rows, takes 110 seconds.
Both tests are run on the same machine and I'm using the go-mysql-driver.
Now for some Go code:
This is extremely dumbed-down code and it still takes almost 2 minutes, compared to PHP, which does it in less than half that.
db := GetDbCon()
defer db.Close()

stmt, _ := db.Prepare("INSERT INTO ticket ( event_id, entry_id, column_headers, column_data, hash, salt ) VALUES ( ?, ?, ?, ?, ?, ? )")

for i := 0; i < 10000; i++ {
    //CreateTicket(columns, line, storedEvent)
    StoreTicket(models.Ticket{int64(0), storedEvent.Id, int64(i),
        "", "", "", "", int64(0), int64(0)}, *stmt)
}

//Extra functions
func StoreTicket(ticket models.Ticket, stmt sql.Stmt) {
    stmt.Exec(ticket.EventId, ticket.EntryId, ticket.ColumnHeaders, ticket.ColumnData, ticket.Hash, ticket.Salt)
}

func GetDbCon() sql.DB {
    db, _ := sql.Open("mysql", "bla:bla@/bla")
    return *db
}
Profiler result
So is it my code, the go-mysql-driver, or is this normal and PHP is just really fast at inserting records?
==EDIT==
As requested, I have recorded both the PHP and Go runs with tcpdump:
The files:
Go Tcpdump
Go Textdump
PHP Tcpdump
PHP Textdump
I have a hard time reaching any conclusions comparing the two logs; both seem to be sending the same size packets back and forth. But with Go (~110s) MySQL seems to take almost twice as long to process the requests as with PHP (~44s), and Go also seems to wait slightly longer before sending a new request (though the difference is minimal).
It's an old question but still - better late than never; you're in for a treat:
Put all your data into a bytes.Buffer as tab-separated, newline-terminated and unquoted lines (if the text causes problems, it has to be escaped first). NULL has to be encoded as \N.
Use http://godoc.org/github.com/go-sql-driver/mysql#RegisterReaderHandler and register a function returning that buffer under "instream". Next, call LOAD DATA LOCAL INFILE "Reader::instream" INTO TABLE ... - that's a very fast way to pump data into MySQL (I measured about 19 MB/sec with Go from a file piped from stdin compared to 18 MB/sec for the MySQL command line client when uploading data from stdin).
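A sketch of that approach (column names borrowed from the ticket table above; RegisterReaderHandler and DeregisterReaderHandler are go-sql-driver/mysql functions, the rest is my scaffolding; the MySQL server must also permit LOAD DATA LOCAL INFILE):

import (
    "bytes"
    "database/sql"
    "io"
    "strings"

    "github.com/go-sql-driver/mysql"
)

func bulkLoadTickets(db *sql.DB, rows [][]string) error {
    var buf bytes.Buffer
    for _, row := range rows {
        // tab-separated, newline-terminated, unquoted; NULL would be \N
        buf.WriteString(strings.Join(row, "\t"))
        buf.WriteByte('\n')
    }
    // Register the buffer under "instream" so the driver can stream it.
    mysql.RegisterReaderHandler("instream", func() io.Reader { return &buf })
    defer mysql.DeregisterReaderHandler("instream")
    _, err := db.Exec(`LOAD DATA LOCAL INFILE 'Reader::instream' INTO TABLE ticket
        (event_id, entry_id, column_headers, column_data, hash, salt)`)
    return err
}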
As far as I know, that very driver is the only way to LOAD DATA LOCAL INFILE without the need of a file.
I notice you're not using a transaction; if you're using vanilla MySQL 5.x with InnoDB, this will be a huge performance drag, as it will auto-commit on every insert.
func GetDbCon() sql.DB {
    db, _ := sql.Open("mysql", "bla:bla@/bla")
    return *db
}

func PrepareTx(db *sql.DB, qry string) (tx *sql.Tx, s *sql.Stmt, e error) {
    if tx, e = db.Begin(); e != nil {
        return
    }
    if s, e = tx.Prepare(qry); e != nil {
        tx.Rollback()
    }
    return
}
db := GetDbCon()
defer db.Close()

qry := "INSERT INTO ticket ( event_id, entry_id, column_headers, column_data, hash, salt ) VALUES ( ?, ?, ?, ?, ?, ? )"

tx, stmt, e := PrepareTx(&db, qry)
if e != nil {
    panic(e)
}
defer tx.Rollback()

for i := 0; i < 10000; i++ {
    ticket := models.Ticket{int64(0), storedEvent.Id, int64(i), "", "", "", "", int64(0), int64(0)}
    stmt.Exec(ticket.EventId, ticket.EntryId, ticket.ColumnHeaders, ticket.ColumnData, ticket.Hash, ticket.Salt)

    // To avoid huge transactions, commit every 1000 rows
    if i%1000 == 0 {
        if e := tx.Commit(); e != nil {
            panic(e)
        } else {
            // can only commit once per transaction, so start a new one
            tx, stmt, e = PrepareTx(&db, qry)
            if e != nil {
                panic(e)
            }
        }
    }
}

// Handle leftovers - should also check it isn't already committed
if e := tx.Commit(); e != nil {
    panic(e)
}