Groovy withBatch is really slow - MySQL

I'm trying to use Groovy's withBatch method and it's really slow (15 sec). I've tried different batch sizes (10, 400, ...) and each batch always takes a long time. This is the second query I've written with it, and both are slow.
Here's my code. Is there a bug in it, or am I using it the wrong way?
static updateCSProducts(def conn, def webProductRows) {
    conn.withBatch(400, """
        UPDATE cscart_products p
        SET usergroup_ids = :usergroup_ids,
            b2b_group_ids = :b2b_group_ids,
            b2b_desc_hide = :b2b_desc_hide
        WHERE product_code = :product_code
           OR product_id = (SELECT product_id FROM cscart_product_options_inventory WHERE product_code = :product_code)
    """) { ps ->
        webProductRows.each { row ->
            ProductType type = ProductType.fromCode(row.type)
            String userGroupIds = type.getProductAvailabilityUserGroup().collect { it.getId() }.join(",")
            String b2bGroupIds = type.getB2bUserGroup().collect { it.getId() }.join(",")
            boolean b2bDescHide = !type.getB2bUserGroup().isEmpty()
            println row.id + " " + userGroupIds + " " + b2bGroupIds + " " + b2bDescHide
            ps.addBatch(product_code: row.id, usergroup_ids: userGroupIds,
                        b2b_group_ids: b2bGroupIds, b2b_desc_hide: b2bDescHide)
        }
    }
}
I'm using MySQL as the database. When I look at the SQL connections, I don't see any connection running a query while I'm waiting for the next batch.
EDIT:
I've removed the subquery and it's still very slow.
Here's the updated version:
conn.withBatch(400, """
    UPDATE cscart_products p
    SET usergroup_ids = :usergroup_ids,
        b2b_group_ids = :b2b_group_ids,
        b2b_desc_hide = :b2b_desc_hide
    WHERE p.product_code = :product_code
""") { ps ->
    webProductRows.each { row ->
        ProductType type = ProductType.fromCode(row.type)
        String userGroupIds = type.getProductAvailabilityUserGroup().collect { it.getId() }.join(",")
        String b2bGroupIds = type.getB2bUserGroup().collect { it.getId() }.join(",")
        String b2bDescHide = !type.getB2bUserGroup().isEmpty() ? 'Y' : 'N'
        println row.id + " " + userGroupIds + " " + b2bGroupIds + " " + b2bDescHide
        ps.addBatch(product_code: row.id, usergroup_ids: userGroupIds,
                    b2b_group_ids: b2bGroupIds, b2b_desc_hide: b2bDescHide)
    }
}

You're running extra queries on each and every update. You'd be better off retrieving the data as a list once and then looping over that. It's not withBatch that's the bottleneck; it's your implementation.
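A minimal sketch of that suggestion, written in Python with mysql-connector-python since the surrounding Groovy types aren't shown here (the connection settings and the web_product_rows structure are assumptions): fetch the options-inventory mapping once, then batch the simple UPDATE.
# Sketch only: prefetch the lookup once instead of running a
# correlated subquery for every batched row.
import mysql.connector

cnx = mysql.connector.connect(host='localhost', user='user',
                              password='secret', database='cscart')  # assumed credentials
cur = cnx.cursor()

cur.execute("SELECT product_code, product_id "
            "FROM cscart_product_options_inventory")
option_ids = dict(cur.fetchall())  # product_code -> product_id

params = [(row['usergroup_ids'], row['b2b_group_ids'], row['b2b_desc_hide'],
           row['id'], option_ids.get(row['id']))
          for row in web_product_rows]  # hypothetical stand-in for webProductRows

cur.executemany(
    "UPDATE cscart_products "
    "SET usergroup_ids = %s, b2b_group_ids = %s, b2b_desc_hide = %s "
    "WHERE product_code = %s OR product_id = %s",
    params)
cnx.commit()
cur.close()
cnx.close()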

Related

Device dependency in ZABBIX 4.2

Suppose the following scenario using Zabbix 4.2: we have a core switch, two distribution switches, and 20 access switches, where the distribution switches are connected to the core and 10 access switches are connected to each distribution switch. I monitor all of them with SNMP v2c, using the official Cisco switch template. The problem is that I cannot easily define device dependencies in Zabbix: if a distribution switch goes down, I want an alarm for that device only, not for all the access switches behind it. I could do this by changing the triggers on each device and making them dependent on the corresponding distribution-switch trigger, but that is too time consuming. What should I do? Any help is appreciated.
You're right, there isn't an easy way to set this kind of dependency.
I had to manage the same situation a while ago, so I wrote a Python dependency setter that uses a "dependent hostgroup <--> master host" logic.
You can modify it to fit your needs (see masterTargetTriggerDescription and slaveTargetTriggerDescription for the dependency targets). It works but contains little error checking: use at your own risk!
import csv
import re
import json
from zabbix.api import ZabbixAPI

# Zabbix server endpoint
zabbixServer = 'https://yourzabbix/zabbix/'
zabbixUser = 'admin'
zabbixPass = 'zabbix'

zapi = ZabbixAPI(url=zabbixServer, user=zabbixUser, password=zabbixPass)

# Hostgroup variables - to reference IDs while building API parameters
hostGroupNames = []  # list = array
hostGroupId = {}     # dict = associative array

# CSV file for dependency settings - see the format:
"""
Hostgroup;Master
ACCESS_1;DistSwitch1
ACCESS_2;DistSwitch1
ACCESS_5;DistSwitch2
ACCESS_6;DistSwitch2
DIST;CoreSwitch1
"""
fileName = 'dependancy.csv'

masterTargetTriggerDescription = '{HOST.NAME} is unavailable by ICMP'
slaveTargetTriggerDescription = '{HOST.NAME} is unavailable by ICMP|Zabbix agent on {HOST.NAME} is unreachable'

# Read the CSV file
hostFile = open(fileName)
hostReader = csv.reader(hostFile, delimiter=';', quotechar='|')
hostData = list(hostReader)

# CSV parsing
for line in hostData:
    hostgroupName = line[0]
    masterName = line[1]
    slaveIds = []

    masterId = zapi.get_id('host', item=masterName, with_id=False, hostid=None)
    hostGroupId = zapi.get_id('hostgroup', item=hostgroupName, with_id=False, hostid=None)
    masterTriggerObj = zapi.trigger.get(hostids=masterId, filter=({'description': masterTargetTriggerDescription}))

    print "Group: " + hostgroupName + " - ID: " + str(hostGroupId)
    print "Master host: " + masterName + " - ID: " + str(masterId)
    print "Master trigger: " + masterTriggerObj[0]['description'] + " - ID: " + str(masterTriggerObj[0]['triggerid'])

    # cycle through slave hosts
    hostGroupObj = zapi.hostgroup.get(groupids=hostGroupId, selectHosts='extend')
    for host in hostGroupObj[0]['hosts']:
        # exclude the master
        if host['hostid'] != str(masterId):
            print " - Host Name: " + host['name'] + " - ID: " + host['hostid'] + " - MASTER: " + str(masterId)
            # cycle through all of the slave's triggers
            slaveTargetTriggerObj = zapi.trigger.get(hostids=host['hostid'])
            #print json.dumps(slaveTargetTriggerObj)
            for slaveTargetTrigger in slaveTargetTriggerObj:
                # search for dependency targets
                if re.search(slaveTargetTriggerDescription, slaveTargetTrigger['description'], re.IGNORECASE):
                    print " - Trigger: " + slaveTargetTrigger['description'] + " - ID: " + slaveTargetTrigger['triggerid']
                    # Clear existing dependencies from the trigger, then create the new one
                    clear = zapi.trigger.deletedependencies(triggerid=slaveTargetTrigger['triggerid'].encode())
                    result = zapi.trigger.adddependencies(triggerid=slaveTargetTrigger['triggerid'].encode(), dependsOnTriggerid=masterTriggerObj[0]['triggerid'])

    print "----------------------------------------"
    print ""
I updated the code contributed by Simone Zabberoni and rewrote it to work with Python 3, py-zabbix, and YAML.
#!/usr/bin/python3
import re
import yaml

# https://pypi.org/project/py-zabbix/
from pyzabbix import ZabbixAPI

# Zabbix server endpoint
zabbix_server = 'https://zabbix.example.com/zabbix/'
zabbix_user = 'zbxuser'
zabbix_pass = 'zbxpassword'

# Create a ZabbixAPI class instance
zapi = ZabbixAPI(zabbix_server)

# Enable HTTP auth
zapi.session.auth = (zabbix_user, zabbix_pass)

# Login (with HTTP auth, only the username is needed; the password, if passed, is ignored)
zapi.login(zabbix_user, zabbix_pass)

# Hostgroup variables - to reference IDs while building API parameters
hostGroupNames = []  # list = array
hostGroupId = {}     # dict = associative array

# YAML file for dependency settings - see the format:
"""
pvebar16 CTs:
  master: pvebar16.example.com
  masterTargetTriggerDescription: 'is unavailable by ICMP'
  slaveTargetTriggerDescription: 'is unavailable by ICMP|Zabbix agent is unreachable for 5 minutes'
"""
fileName = 'dependancy.yml'

with open(fileName) as f:
    hostData = yaml.safe_load(f)  # safe_load avoids executing arbitrary YAML tags

for groupyml in hostData.keys():
    masterTargetTriggerDescription = hostData[groupyml]['masterTargetTriggerDescription']
    slaveTargetTriggerDescription = hostData[groupyml]['slaveTargetTriggerDescription']
    masterName = hostData[groupyml]['master']
    hostgroupName = groupyml
    slaveIds = []

    masterId = zapi.host.get(filter={'host': masterName}, output=['hostid'])[0]['hostid']
    hostGroupId = zapi.hostgroup.get(filter={'name': hostgroupName}, output=['groupid'])[0]['groupid']
    masterTriggerObj = zapi.trigger.get(host=masterName, filter={'description': masterTargetTriggerDescription}, output=['triggerid', 'description'])

    print("Group: " + hostgroupName + " - ID: " + str(hostGroupId))
    print("Master host: " + masterName + " - ID: " + str(masterId))
    print("Master trigger: " + masterTriggerObj[0]['description'] + " - ID: " + str(masterTriggerObj[0]['triggerid']))

    # cycle through slave hosts
    hostGroupObj = zapi.hostgroup.get(groupids=hostGroupId, selectHosts='extend')
    for host in hostGroupObj[0]['hosts']:
        # exclude the master
        if host['hostid'] != str(masterId):
            print(" - Host Name: " + host['name'] + " - ID: " + host['hostid'] + " - MASTER: " + str(masterId))
            # cycle through all of the slave's triggers
            slaveTargetTriggerObj = zapi.trigger.get(hostids=host['hostid'])
            for slaveTargetTrigger in slaveTargetTriggerObj:
                # search for dependency targets
                if re.search(slaveTargetTriggerDescription, slaveTargetTrigger['description'], re.IGNORECASE):
                    print(" - Trigger: " + slaveTargetTrigger['description'] + " - ID: " + slaveTargetTrigger['triggerid'])
                    # Clear existing dependencies from the trigger, then create the new one
                    clear = zapi.trigger.deletedependencies(triggerid=slaveTargetTrigger['triggerid'])
                    result = zapi.trigger.adddependencies(triggerid=slaveTargetTrigger['triggerid'], dependsOnTriggerid=masterTriggerObj[0]['triggerid'])

    print("----------------------------------------")
    print("")

Why is setting a SQL variable creating an error?

The following SQL statement creates an error with the message:
"Message: Fatal error encountered during command execution."
"Inner exception: Parameter '@LastUserID' must be defined."
If I use LAST_INSERT_ID() directly instead of @LastUserID, it always returns zero (and hence fails at the second insert) when executed like this.
I don't see how my syntax differs from the MySQL documentation.
Could someone help me?
string Query = @"INSERT INTO login (" +
    "LOGIN_EMAIL," +
    "LOGIN_PASSWORD," +
    "LOGIN_SALT," +
    "LOGIN_LAST_LOGIN_DATE," +
    // "LOGIN_LAST_LOGIN_LOCATION," +
    "LOGIN_ACCOUNT_STATUS," +
    "LOGIN_LOGIN_ATTEMPTS," +
    "LOGIN_CREATED_DATE) " +
    "VALUES (" +
    "@Parameter2," +
    "@Parameter3," +
    "@Parameter4," +
    "@Parameter5," +
    // "@Parameter6," +
    "@Parameter6," +
    "@Parameter7," +
    "@Parameter8); " +
    "SET @LastUserID = LAST_INSERT_ID(); " +
    "INSERT INTO user_role (" +
    "USER_ROLE_USER_ID," +
    "USER_ROLE_ROLE," +
    "USER_ROLE_STATUS," +
    "USER_ROLE_CREATED_DATE) " +
    "SELECT " +
    "@LastUserID," +
    "@Parameter9," +
    "@Parameter10," +
    "@Parameter11 " +
    "FROM dual WHERE NOT EXISTS (SELECT USER_ROLE_USER_ID FROM user_role " +
    "WHERE USER_ROLE_USER_ID = @LastUserID AND USER_ROLE_ROLE = @Parameter9)";

MySqlCommand oCommand = new MySqlCommand(Query, oMySQLConnecion);
oCommand.Transaction = tr;
Create a stored procedure in which you first do your insert, cache the last inserted id, do the other insert, and have it print out your parameters along with a boolean indicating whether the last insert worked. That way you can debug it properly.
In general you should avoid concatenating strings to generate SQL commands, or you may run into trouble with parameters containing unexpected characters or be hit by SQL injection.
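To illustrate the parameterization point, here's a minimal Python sketch (the idea carries over directly to MySqlCommand parameters in C#; the credentials and column values are assumptions):
# Sketch only: the driver sends the values separately from the SQL text,
# so quoting and injection are handled for you.
import mysql.connector

cnx = mysql.connector.connect(user='user', password='secret',
                              database='mydb')  # assumed credentials
cur = cnx.cursor()
email = 'user@example.com'               # example values, never concatenated
password_hash = 'not-a-real-hash'        # into the SQL string itself
cur.execute(
    "INSERT INTO login (LOGIN_EMAIL, LOGIN_PASSWORD) VALUES (%s, %s)",
    (email, password_hash),
)
cnx.commit()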
Simple fix: replace "@LastUserID" with "@'LastUserID'". The apostrophe makes the difference.

C++ MySQL, cannot close a MySQL connection

I have a function that receives parameters (schema name, column name, etc.) and updates a MySQL table. The problem is that when I use two MySQL commands inside this function (below), one to set the schema and one to update the table, the connection-close command at the end (conDataBase3->Close();) does not work.
I am checking the number of open connections in the MySQL console (SHOW FULL PROCESSLIST) before and after running the function. Any solutions or explanations? Thanks.
int simple_1::update_table_with_value(gcroot<String^ > schema, gcroot<String^ > table_name, int numerator, gcroot<String^ > field_to_update, double value_to_update)
{
    gcroot<MySqlConnection^ > conDataBase3;
    conDataBase3 = gcnew MySqlConnection(constring);
    conDataBase3->Open();
    try {
        // select the schema to work in
        String ^ schema_name = "Use " + schema + " ;";
        MySqlCommand ^cmdDataBase3 = gcnew MySqlCommand(schema_name, conDataBase3);
        cmdDataBase3->ExecuteNonQuery();

        // build the UPDATE statement
        String ^ temp1 = "UPDATE ";
        String ^ temp2 = table_name;
        String ^ temp3 = " SET ";
        String ^ temp4 = field_to_update;
        String ^ temp6 = "=(@value1) WHERE numerator = (@value2)";
        String ^ temp8 = temp1 + temp2 + temp3 + temp4 + temp6;

        MySqlCommand ^cmdDataBase4 = gcnew MySqlCommand(temp8, conDataBase3);
        cmdDataBase4->Parameters->AddWithValue("@value1", value_to_update);
        cmdDataBase4->Parameters->AddWithValue("@value2", numerator);
        cmdDataBase4->Prepare();
        cmdDataBase4->ExecuteNonQuery();
    }
    catch (Exception^ ex)
    {
        System::Windows::Forms::MessageBox::Show(ex->Message);
    }
    conDataBase3->Close();
    int answer = 0;
    return (answer);
}
OK, found the answer: I needed to disable the pooling option; otherwise closing the connection still keeps the socket open.
Found it here: http://bugs.mysql.com/bug.php?id=24138 (see the last two lines).
How to disable pooling (add Pooling=false to the connection string):
http://www.connectionstrings.com/mysql-connector-net-mysqlconnection/disable-connection-pooling/

subprocess.Popen returning an empty string

There was an earlier question on this, but the asker was just overwriting their output and solved their own problem.
I'm using subprocess.Popen to read video information and write the output to a JSON file. It works fine on most videos, but on others it returns an empty string, even though the same command runs fine from the command line. I tried it several times and get the data fine through the command line.
Here's the relevant part of the script:
out_prj.write('[')
for m, i in enumerate(files):
    print i
    out_prj.write('{"$type":"BatchProcessor.Job, BatchProcessor","Id":0,"Ver":1.02,"CurrentTask":0,"IsSelected":true,"TaskList":[')
    f_name = os.path.basename(i[0])
    f_json = out_folder + os.sep + "06_Output" + os.sep + os.path.basename(i[0]).split(".")[0] + ".json"
    trans_f = out_folder + os.sep + "04_Video" + os.sep + os.path.basename(i[0]).split(".")[0] + "-tr.ts"
    trans_f_out = out_folder + os.sep + "06_Output" + os.sep + os.path.basename(i[0]).split(".")[0] + "-tr-out.ts"
    ffprobe = 'ffprobe.exe'
    command = [ffprobe, '-v', 'quiet', '-print_format', 'json', '-show_format', '-show_streams', i[0]]
    p = sp.Popen(command, stdout=sp.PIPE, stderr=sp.PIPE, shell=True)
    out, err = p.communicate()
    io = cStringIO.StringIO(out)
    info = json.load(io)
    print info
    filea = open(f_json, 'w')
    filea.write(json.dumps(info))
    filea.close()
    f = open(f_json)
    b = json.load(f)
    print b
    #########################
    ###################
    f_format = str(b['streams'][0]['codec_long_name'])
Your code ignores error messages (the err variable). print err, or don't redirect stderr, to see them.
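A minimal sketch of that check, reusing the ffprobe command from the question:
import subprocess as sp

command = ['ffprobe', '-v', 'quiet', '-print_format', 'json',
           '-show_format', '-show_streams', 'input.mp4']  # as in the question
p = sp.Popen(command, stdout=sp.PIPE, stderr=sp.PIPE)  # no shell=True needed
out, err = p.communicate()
if p.returncode != 0 or not out:
    print err  # ffprobe's own explanation of why no JSON came back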
Unrelated: the JSON handling in your code is overly convoluted; most of the operations are redundant.
To save output of the subprocess to a file:
import os
from subprocess import check_call
f_json = os.path.join(out_folder, "06_Output",
                      os.path.splitext(f_name)[0] + ".json")
with open(f_json, 'wb', 0) as file:
    check_call(command, stdout=file)
Note: shell=True is not necessary here. If subprocess can't find ffprobe.exe, specify the full path, e.g. (use the path appropriate for your system):
ffprobe = r'C:\Program Files\Real\RealPlayer\RPDS\Tools\ffmpeg\ffprobe.exe'
Note: r'' -- a raw string literal is used to avoid doubling the backslashes.

MySQL to Postgres conversion

Does anyone know what could be causing this error? I'm trying to convert a MySQL site to Postgres so I can host it on Heroku. I'm new to database syntax, and this problem has been bugging me for days.
PG::Error: ERROR: syntax error at or near "ON"
LINE 1: ...tores ("key", "value") VALUES ('traffic:hits', 0) ON DUPLICA...
^
Here's the github page for the site I'm trying to convert. https://github.com/jcs/lobsters
This is the query. I've replaced the backticks with escaped double quotes (\").
if Rails.env == "test"
  Keystore.connection.execute("INSERT OR IGNORE INTO " <<
    "#{Keystore.table_name} (\"key\", \"value\") VALUES " <<
    "(#{q(key)}, 0)")
  Keystore.connection.execute("UPDATE #{Keystore.table_name} " <<
    "SET \"value\" = \"value\" + #{q(amount)} WHERE \"key\" = #{q(key)}")
else
  Keystore.connection.execute("INSERT INTO #{Keystore.table_name} (" +
    "\"key\", \"value\") VALUES (#{q(key)}, #{q(amount)}) ON DUPLICATE KEY " +
    "UPDATE \"value\" = \"value\" + #{q(amount)}")
end
Postgres' INSERT doesn't support MySQL's INSERT ... ON DUPLICATE KEY UPDATE variant.
For alternatives, see the answers to this question.
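One such alternative, if your Postgres is 9.5 or newer, is the native upsert, INSERT ... ON CONFLICT. A hedged sketch with psycopg2 (the DSN, and a unique index on "key", are assumptions):
# Sketch only: requires PostgreSQL >= 9.5 and a unique index on "key".
import psycopg2

conn = psycopg2.connect("dbname=lobsters")  # hypothetical DSN
with conn, conn.cursor() as cur:
    cur.execute(
        'INSERT INTO keystores ("key", "value") VALUES (%s, %s) '
        'ON CONFLICT ("key") DO UPDATE '
        'SET "value" = keystores."value" + EXCLUDED."value"',
        ('traffic:hits', 1),
    )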
I was working on this exact code last night; here's an initial take at how I fixed it, following this answer:
def self.put(key, value)
  key_column = Keystore.connection.quote_column_name("key")
  value_column = Keystore.connection.quote_column_name("value")
  if Keystore.connection.adapter_name == "SQLite"
    Keystore.connection.execute("INSERT OR REPLACE INTO " <<
      "#{Keystore.table_name} (#{key_column}, #{value_column}) VALUES " <<
      "(#{q(key)}, #{q(value)})")
  elsif Keystore.connection.adapter_name == "PostgreSQL"
    Keystore.connection.execute("UPDATE #{Keystore.table_name} " +
      "SET #{value_column} = #{q(value)} WHERE #{key_column} = #{q(key)}")
    Keystore.connection.execute("INSERT INTO #{Keystore.table_name} (#{key_column}, #{value_column}) " +
      "SELECT #{q(key)}, #{q(value)} " +
      "WHERE NOT EXISTS (SELECT 1 FROM #{Keystore.table_name} WHERE #{key_column} = #{q(key)})"
    )
  elsif Keystore.connection.adapter_name == "MySQL" || Keystore.connection.adapter_name == "Mysql2"
    Keystore.connection.execute("INSERT INTO #{Keystore.table_name} (" +
      "#{key_column}, #{value_column}) VALUES (#{q(key)}, #{q(value)}) ON DUPLICATE KEY " +
      "UPDATE #{value_column} = #{q(value)}")
  else
    raise "Error: keystore requires db-specific put method."
  end
  true
end
There are a number of things to be fixed in the lobsters codebase beyond just this for Postgres compatibility; I found MySQL-specific code in other controller files as well. I'm currently working on them in my own lobsters fork at https://github.com/seltzered/journaltalk - Postgres fixes should be committed there in the coming day or two.