Ruby DBI fetch can't handle all-zero dates in MySQL

I'm trying to access our database with some fairly simple CGI/Ruby and DBI:
#!/usr/bin/ruby -w
require "dbi"
dbh = DBI.connect("dbi:Mysql:my:mydb", "XXXX", "XXXX")
...
query = "select ETA from mydb.mytable where ID = #{someval}"
rows = dbh.execute(query)
while row = rows.fetch() do
  # Do some stuff
  ...
end
That works fine most of the time, but I hit a record which broke it with the error:
/usr/lib/ruby/vendor_ruby/dbd/Mysql.rb:120:in `parse': invalid date (ArgumentError)
from /usr/lib/ruby/vendor_ruby/dbd/Mysql.rb:120:in `parse'
from /usr/lib/ruby/vendor_ruby/dbi/row.rb:66:in `block in convert_types'
from /usr/lib/ruby/vendor_ruby/dbi/row.rb:65:in `each'
from /usr/lib/ruby/vendor_ruby/dbi/row.rb:65:in `each_with_index'
from /usr/lib/ruby/vendor_ruby/dbi/row.rb:65:in `convert_types'
from /usr/lib/ruby/vendor_ruby/dbi/row.rb:75:in `set_values'
from /usr/lib/ruby/vendor_ruby/dbi/handles/statement.rb:226:in `fetch'
from /usr/lib/cgi-bin/test:39:in `block in <main>'
from /usr/lib/cgi-bin/test:36:in `each'
from /usr/lib/cgi-bin/test:36:in `<main>'
After a bit of detective work I found that the record had a date of 0000-00-00, which fetch() doesn't like. NULL dates are fine, and Perl's DBI copes with all-zero dates; it's only Ruby's DBI that chokes.
I can fix the database, and I'll try to get the app that wrote the value to the database fixed too, but I think my Ruby should be resilient to such things. Is there a way to work around this, perhaps using rescue somehow?

This is the solution I came up with:
query = "select ETA from mydb.mytable where ID = #{someval}"
rows = dbh.execute(query)
begin
  while row = rows.fetch() do
    # Do some stuff
    ...
  end
rescue StandardError => e # rescue StandardError rather than Exception, so signals etc. still propagate
  puts "#{e}<br>\n"
  retry
end
This works quite nicely: although retry restarts the while loop, rows keeps its cursor state, so the fetch resumes at the next record.
The only problem is that it's hard to identify the bad record(s). To fix that I issued further queries without the offending field. My somewhat ugly solution:
results = Hash.new
hit_error = false
query = "select ETA,UNIQUE_ID from mydb.mytable where ID = #{someval}"
rows = dbh.execute(query)
begin
  while row = rows.fetch() do
    # Do some stuff
    ...
    results[row[1]] = row[0]
  end
rescue StandardError => e
  hit_error = true
  retry
end
if hit_error
  query = "select UNIQUE_ID from mydb.mytable where ID = #{someval}"
  rows = dbh.execute(query)
  while row = rows.fetch() do
    id = row[0]
    unless results.has_key?(id)
      begin
        query = "select ETA from mydb.mytable where UNIQUE_ID = #{id} limit 1"
        error = dbh.execute(query)
        error.fetch() # expect this to hit the same error as before
        puts "Unexpected success for UNIQUE_ID #{id}<br>\n"
      rescue StandardError => e
        puts "#{e} at UNIQUE_ID #{id}<br>\n"
      end
    end
  end
end
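As an aside: rather than probing record by record from Ruby, the zero dates can usually be found with a single query, because MySQL itself is quite happy to compare against '0000-00-00' even though the Ruby driver refuses to parse it. A sketch under that assumption, using the same table layout as above:
query = "select UNIQUE_ID from mydb.mytable where ID = #{someval} and ETA = '0000-00-00'"
rows = dbh.execute(query)
while row = rows.fetch() do
  puts "Zero date at UNIQUE_ID #{row[0]}<br>\n"
end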
Finally, I'm not sure how advisable it is to use Ruby DBI at all; it appears to be deprecated, as does the mysql driver. I also tried Sequel, with little success.

An alternative solution is to use fetch_hash. This fetches all the data as strings, which is a little awkward because you have to convert to dates, integers, and so on yourself, but it does give you the chance to trap the error on the explicit conversion. It also makes it much easier to identify the bad record(s):
require "date" # Date.parse needs this

query = "select ETA from mydb.mytable where ID = #{someval}"
rows = dbh.execute(query)
while row = rows.fetch_hash() do
  begin
    eta = Date.parse row["ETA"]
    # Do something with the ETA
    ...
  rescue StandardError => e
    puts "Error parsing date '#{row["ETA"]}': #{e}<br>\n"
    # Make do without the ETA
    ...
  end
end

Sequel can handle invalid dates if you use the mysql adapter:
DB = Sequel.connect('mysql://user:password@host/database')
DB.convert_invalid_date_time = :string # or nil
DB.get(Sequel.cast('0000-00-00', Date))
# => "0000-00-00"
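Under the same assumption about the schema as in the question, a query like this then hands the zero date back as a plain string instead of raising (a sketch):
DB = Sequel.connect('mysql://user:password@host/database')
DB.convert_invalid_date_time = :string
DB[:mytable].where(ID: someval).select_map(:ETA).each do |eta|
  if eta.is_a?(String) # "0000-00-00" arrives as a String, valid dates as Date objects
    # Make do without the ETA
  else
    # Normal date handling
  end
end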

Related

Pulling an Integer value from SQL database to be used for calculation

I am having trouble just trying to pull data out of my table. I just want to pull the integer value from the column Diff and add/subtract numbers to it.
Once I am done adding/subtracting, I want to update each row with the new value.
My table chart for "users" (image not reproduced here).
This is my code as of now
require 'date'
require 'mysql2'
require 'time'

def test()
  connect = Mysql2::Client.new(:host => "localhost", :username => "root", :database => "rubydb")
  result = connect.query("SELECT * FROM users where Status='CheckOut'")
  if result.count > 0
    result.each do |row|
      stored_diff = connect.query("SELECT * FROM users WHERE Diff")
      #stored_diff = stored_diff.to_s
      puts stored_diff
    end
  end
end

test()
I am sure the code in the commented-out line does not work, since I am getting something like #<Mysql2::Result:0x000000000004863248> etc. Can anyone help me with this?
I have no knowledge of Ruby, but I'll show you the steps to achieve what you are trying, based on this and this. (A combined sketch follows the steps.)
Get the user Ids and the Diff numbers:
SELECT `Id`, `Diff` FROM users WHERE `Status`='CheckOut'
Iterate the result:
result.each do |row|
Assign Diff and Id to variables:
user_id = row['Id']
diff_cal = row['Diff']
Do your calculations on the diff_cal variable.
Execute the UPDATE query:
UPDATE `users` SET `Diff` = '#{diff_cal}' WHERE `Id` = '#{user_id}'
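Putting those steps together with mysql2 might look like this (a sketch: the column names Id, Diff, and Status come from the question, and the +10 is just a stand-in for the real calculation):
require 'mysql2'

connect = Mysql2::Client.new(:host => "localhost", :username => "root", :database => "rubydb")
result = connect.query("SELECT `Id`, `Diff` FROM users WHERE `Status`='CheckOut'")
result.each do |row|
  user_id  = row['Id']
  diff_cal = row['Diff'] + 10 # example calculation
  # safe to interpolate here only because both values are integers
  connect.query("UPDATE `users` SET `Diff` = '#{diff_cal}' WHERE `Id` = '#{user_id}'")
end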

Running out of memory when running rake import task in ruby

I am running a task to import around 1 million orders. I am looping through the data to update it to the values of the new database, and it works fine on my local computer with 8 GB of RAM.
However, when I run it on my AWS t2.medium instance, it gets through the first 500 thousand rows, but towards the end, when it starts actually creating orders that don't yet exist, I start maxing out my memory. I am porting a MySQL database to Postgres.
Am I missing something obvious here?
require 'mysql2' # or require 'pg'
require 'active_record'
def legacy_database
  @client ||= Mysql2::Client.new(Rails.configuration.database_configuration['legacy_production'])
end
desc "import legacy orders"
task orders: :environment do
  orders = legacy_database.query("SELECT * FROM oc_order")
  # init progressbar
  progressbar = ProgressBar.create(:total => orders.count, :format => "%E, \e[0;34m%t: |%B|\e[0m")
  orders.each do |order|
    if [1, 2, 13, 14].include? order['order_status_id']
      payment_method = "wx"
      if order['paid_by'] == "Alipay"
        payment_method = "ap"
      elsif order['paid_by'] == "UnionPay"
        payment_method = "up"
      end
      user_id = User.where(import_id: order['customer_id']).first
      if user_id
        user_id = user_id.id
      end
      order = Order.create(
        # id: order['order_id'],
        import_id: order['order_id'],
        # user_id: order['customer_id'],
        user_id: user_id,
        receiver_name: order['payment_firstname'],
        receiver_address: order['payment_address_1'],
        created_at: order['date_added'],
        updated_at: order['date_modified'],
        paid_by: payment_method,
        order_num: order['order_id']
      )
      # increment progress bar on each save
      progressbar.increment
    end
  end
end
I assume that the line orders = legacy_database.query("SELECT * FROM oc_order") loads the entire table into memory, which is very inefficient.
You need to iterate over the table in batches. In ActiveRecord there is the find_each method for that (see the sketch below), but since you don't use ActiveRecord for the legacy database, you may want to implement your own batch querying using limit and offset.
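For comparison, if the legacy table were mapped to an ActiveRecord model (the LegacyOrder class here is hypothetical), batching comes built in:
# Hypothetical model, for illustration only
class LegacyOrder < ActiveRecord::Base
  establish_connection :legacy_production
  self.table_name = "oc_order"
end

# Loads rows 1000 at a time instead of all at once
LegacyOrder.find_each(batch_size: 1000) do |order|
  # same per-order logic as in the task above
end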
In order to handle memory efficiently, you can run the MySQL query in batches, as suggested by nattfodd.
There are two ways to achieve it, as per the MySQL documentation:
SELECT * FROM oc_order LIMIT 5,10;
or
SELECT * FROM oc_order LIMIT 10 OFFSET 5;
Both queries return rows 6-15.
You can pick whatever batch size you like and run the queries in a loop until the orders result comes back empty (ideally with an ORDER BY on a unique column, so the pagination is stable).
Let us assume you handle 1000 orders at a time; then you'll have something like this:
batch_size = 1000
offset = 0
loop do
  orders = legacy_database.query("SELECT * FROM oc_order LIMIT #{batch_size} OFFSET #{offset}")
  break if orders.count == 0 # Mysql2::Result has no empty?, so .present? would never be false
  offset += batch_size
  orders.each do |order|
    ... # your logic for creating new model objects
  end
end
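One caveat with this approach (an aside, not from the original answer): OFFSET gets progressively slower on large tables, because MySQL still has to scan past all the skipped rows. Seeking on the primary key avoids that; a sketch, assuming order_id is the primary key of oc_order:
last_id = 0
loop do
  orders = legacy_database.query("SELECT * FROM oc_order WHERE order_id > #{last_id} ORDER BY order_id LIMIT 1000")
  break if orders.count == 0
  orders.each do |order|
    ... # per-order logic
    last_id = order['order_id']
  end
end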
It is also advised to run your code in production with proper error handling:
begin
  ... # main logic
rescue => e
  ... # handle the error
ensure
  ... # cleanup that always runs
end
Disabling row caching while iterating over the orders collection should reduce the memory consumption:
orders.each(cache_rows: false) do |order|
  ... # per-order logic
end
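Along the same lines, mysql2 can stream the result set rather than materialising it, keeping roughly one row in memory at a time. Note that with streaming the result can only be iterated once, and count is unknown until iteration finishes, so the progress bar total would need its own SELECT COUNT(*):
orders = legacy_database.query("SELECT * FROM oc_order", stream: true, cache_rows: false)
orders.each do |order|
  # same per-order logic
end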
There is a gem that helps us do this called activerecord-import. You build the objects in memory and let the gem persist them in bulk:
bulk_orders = []
orders.each do |order|
  bulk_orders << Order.new( # collect the record instead of discarding it
    # id: order['order_id'],
    import_id: order['order_id'],
    # user_id: order['customer_id'],
    user_id: user_id,
    receiver_name: order['payment_firstname'],
    receiver_address: order['payment_address_1'],
    created_at: order['date_added'],
    updated_at: order['date_modified'],
    paid_by: payment_method,
    order_num: order['order_id']
  )
end

Order.import bulk_orders, validate: false
This persists all the collected orders with a single INSERT statement.
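Building a million Order objects up front trades database round-trips for Ruby heap, though, so on a memory-constrained box you would probably combine this with the batching above and import in slices (a sketch):
bulk_orders.each_slice(1000) do |batch|
  Order.import batch, validate: false # one INSERT per 1000 rows
end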

Raw SQL with "IF NOT EXISTS" doesn't execute

This is my Python code. It seems that if the raw SQL contains IF NOT EXISTS, SQLAlchemy will not execute it, and no exception is thrown either.
db.execute(text(
    """
    IF NOT EXISTS ( select 1 from agent_assignment where exception_id = :exception_id )
    BEGIN
        insert into agent_assignment(exception_id, [user], work_status, insert_date, insert_user)
        values (:exception_id, :user, 'pending', :date, :insert_update_user)
    END
    ELSE
        update agent_assignment
        set
            [user] = :user,
            update_date = :date,
            update_user = :insert_update_user
        where exception_id = :exception_id
    """),
    exception_id = exception_id,
    user = assignee,
    date = datetime.now(),
    insert_update_user = insert_update_user
)
If I remove the IF..ELSE part, the SQL executes correctly. So is it technically impossible to execute raw SQL with IF..ELSE or EXISTS as part of the statement?
What is the proper way to run such raw SQL?
Thanks in advance.
It turned out I needed to add COMMIT at the end of the script; the query is kind of complex, and somehow SQLAlchemy can't auto-commit it.

Rails 3. Checking for true values in SQL

I need to check if the column exam has a value of true. So I set this up but it doesn't work...
#exam_shipments = Shipment.where("exam <> NULL AND exam <> 0 AND customer_id = ?", current_admin_user.customer_id)
# This one gives me error "SQLite3::SQLException: no such column: true:"
#exam_shipments = Shipment.where("exam = true AND customer_id = ?", current_admin_user.customer_id)
#exam_shipments = Shipment.where("exam = 1 AND customer_id = ?", current_admin_user.customer_id)
You should really just stick to AR syntax:
@exam_shipments = Shipment.where(:exam => true, :customer_id => current_admin_user.customer_id)
Assuming :exam is a boolean field on your Shipment model, ActiveRecord takes care of converting your query to the proper syntax for the given database. So the less inline SQL you write, the more database-agnostic and portable your code will be.
Why do you need to execute SQL?
It's much easier just to do
@exam_shipments = Shipment.find_by_id(current_admin_user.customer_id).exam?

MySQL error: `query': Duplicate entry '' for key 3 (Mysql2::Error) on a Ruby file

This is the code I am using
# update db
client = Mysql2::Client.new(:host => "localhost", :username => "jo151", :password => "password", :database => "jo151")
details.each do |d|
  if d[:sku] != ""
    price = d[:price].split
    if price[1] == "D"
      currency = 144
    else
      currency = 168
    end
    cost = price[0].gsub(",", "").to_f
    if d[:qty] == ""
      qty = d[:qty2]
    else
      qty = d[:qty]
    end
    results = client.query("SELECT * FROM jos_virtuemart_products WHERE product_sku = '#{d[:sku]}' LIMIT 1;")
    if results.count == 1
      product = results.first
      client.query("UPDATE jos_virtuemart_products SET product_sku = '#{d[:sku]}', product_name = '#{d[:desc]}', product_desc = '#{d[:desc]}', product_in_stock = '#{qty}' WHERE virtuemart_product_id = #{product['virtuemart_product_id']};")
      client.query("UPDATE jos_virtuemart_product_prices SET product_price = '#{cost}', product_currency = '#{currency}' WHERE virtuemart_product_id = '#{product['virtuemart_product_id']}';")
    else
      client.query("INSERT INTO jos_virtuemart_products( product_sku, product_name, product_s_desc, product_in_stock) VALUES('#{d[:sku]}','#{d[:desc]}','#{d[:desc]}','#{d[:qty]}');")
      last_id = client.last_id
      client.query("INSERT INTO jos_virtuemart_product_prices(virtuemart_product_id, product_price, product_currency) VALUES('#{last_id}', '#{cost}', #{currency});")
    end
  end
end
`query': Duplicate entry '' for key 3 (Mysql2::Error) on line 35:
client.query("INSERT INTO jos_virtuemart_products( product_sku, product_name, product_s_desc, product_in_stock) VALUES('#{d[:sku]}','#{d[:desc]}','#{d[:desc]}','#{d[:qty]}');")
last_id = client.last_id
Putting together raw SQL statements with arbitrary strings inlined like this is extremely dangerous. You absolutely must escape any values interpolated into them for your application to work at all; the first description containing an apostrophe will cause your SQL to fail.
In this case you would use client.escape on each and every one of the strings. No exceptions. You have probably seen tons of press about Sony getting hacked; it's because of mistakes like this that serious breaches happen.
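For instance, the earlier UPDATE with escaping applied (a sketch; only the string fields strictly need it, but escaping everything is the safer habit):
sku  = client.escape(d[:sku])
desc = client.escape(d[:desc])
client.query("UPDATE jos_virtuemart_products SET product_sku = '#{sku}', product_name = '#{desc}', product_desc = '#{desc}', product_in_stock = '#{qty}' WHERE virtuemart_product_id = #{product['virtuemart_product_id']};")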
You should investigate using an ORM to help with this; even something as simple as Sequel or DataMapper provides facilities that make this easy (a Sequel sketch follows).
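For example, with Sequel the values are parameterised and quoted for you (a sketch; the connection details are placeholders):
require 'sequel'

DB = Sequel.connect('mysql2://jo151:password@localhost/jo151')
products = DB[:jos_virtuemart_products]

# Sequel escapes and quotes the values automatically
products.insert(product_sku: d[:sku], product_name: d[:desc],
                product_s_desc: d[:desc], product_in_stock: qty)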
The reason you are getting a duplicate key error is that you have a unique index on one of the columns you're inserting into, or one of the columns you haven't specified has a default value that collides with an existing row.