Adding numbers in several arrays together with ruby - mysql

I am using the Ruby mysql2 gem to work with a database. I want to list all the countries per region in one table, then add up the hits per region.
Normally I would use the MySQL SUM function, but the gem returns the headers as well, so this is not possible.
Instead, I am getting the hit count for each country per region and adding it up.
The gem returns one array per result row, and I need to take that result and add it to a running total per region.
This is my code:
#!/usr/bin/ruby -w
# simple.rb - simple MySQL script using Ruby MySQL module
require "rubygems"
require "mysql2"

client = Mysql2::Client.new(:host => "localhost", :username => "root", :database => "cbscdn")

regions = client.query("select Id from Regions")
regions.each(:as => :array) do |rid|
  hit = client.query("select hits from Countries where Regions_Id='#{rid}'")
  hit.each(:as => :array) do |a|
    a.map do |x|
      x.to_i
    end
  end
end
How can I implement the running count per region?

Let the database do the work:
client.query(%q{
  select regions_id, sum(hits)
  from countries
  group by regions_id
}).each(:as => :array) do |row|
  region_id, total_hits = row
  # ...
end
And if you want sums for regions that aren't in the countries table:
client.query(%q{
  select r.id, coalesce(sum(c.hits), 0)
  from regions r
  left outer join countries c
  on r.id = c.regions_id
  group by r.id
}).each(:as => :array) do |row|
  region_id, total_hits = row
  # ...
end

If for any reason you don't want to delegate it to the database engine, you can do it this way:
(for simplicity, I assume your results are already arrays)
regions_count = regions.inject({}) do |hash, rid|
  hit = client.query("whatever")
  hash[rid] = hit.map(&:to_i).inject(0, :+)
  hash
end
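The running-total pattern itself can be exercised without a database. This is a minimal plain-Ruby sketch, assuming each row has already been fetched as a `[region_id, hits]` array (the sample rows here are made up for illustration):

```ruby
# Sample rows standing in for the query results: [region_id, hits] pairs.
rows = [[1, 10], [1, 5], [2, 7], [2, 3], [3, 0]]

# Build a hash of region_id => total hits, defaulting each total to 0.
totals = rows.each_with_object(Hash.new(0)) do |(region_id, hits), sums|
  sums[region_id] += hits.to_i
end

totals  # => {1 => 15, 2 => 10, 3 => 0}
```

`Hash.new(0)` avoids the explicit `hash` accumulator bookkeeping that `inject` requires.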

Related

Pulling an Integer value from SQL database to be used for calculation

I am having trouble just trying to pull data out of my table. I just want to pull the integer value from the column Diff and add/subtract numbers to it.
Once I am done adding/subtracting, I want to update each row with the new value.
My Table chart for "users" in ruby
This is my code as of now
require 'date'
require 'mysql2'
require 'time'

def test()
  connect = Mysql2::Client.new(:host => "localhost", :username => "root", :database => "rubydb")
  result = connect.query("SELECT * FROM users where Status='CheckOut'")
  if result.count > 0
    result.each do |row|
      stored_diff = connect.query("SELECT * FROM users WHERE Diff")
      # stored_diff = stored_diff.to_s
      puts stored_diff
    end
  end
end

test()
I am sure the commented-out line does not work, since I am getting output like #&lt;Mysql2::Result:0x000000000004863248&gt; etc. Can anyone help me with this?
I have no knowledge of Ruby, but I'll show you the steps to achieve what you are trying to do, based on this and this.
Get User Ids and the Diff numbers.
SELECT `Id`, `Diff` FROM users where `Status`='CheckOut'
Iterate the result.
result.each do |row|
Assign Diff and Id into variables.
usrId = row['Id']
diffCal = row['Diff']
Do your calculations to diffCal variable.
Execute the UPDATE query.
UPDATE `users` SET `Diff` = '#{diffCal}' WHERE `Id` = '#{usrId}'
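The steps above can be sketched as runnable Ruby with an in-memory stand-in for the SELECT result. The `rows` data and the `+ 3` adjustment are made up for illustration; with a live connection, each generated string would be passed to `connect.query(...)`:

```ruby
# Made-up rows standing in for: SELECT `Id`, `Diff` FROM users WHERE `Status`='CheckOut'
rows = [
  { 'Id' => 1, 'Diff' => 10 },
  { 'Id' => 2, 'Diff' => 4 },
]

updates = rows.map do |row|
  usrId   = row['Id']
  diffCal = row['Diff'] + 3  # your add/subtract calculation goes here
  "UPDATE `users` SET `Diff` = '#{diffCal}' WHERE `Id` = '#{usrId}'"
end
```

Each element of `updates` is one UPDATE statement ready to execute per row.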

Running out of memory when running rake import task in ruby

I am running a task to import around 1 million orders. I am looping through the data to update it to the values on the new database, and it works fine on my local computer with 8 GB of RAM.
However, when I run it on my AWS t2.medium instance, it gets through the first 500 thousand rows, but towards the end I start maxing out memory, once it actually starts creating the orders that don't exist yet. I am porting a MySQL database to Postgres.
Am I missing something obvious here?
require 'mysql2' # or require 'pg'
require 'active_record'

def legacy_database
  @client ||= Mysql2::Client.new(Rails.configuration.database_configuration['legacy_production'])
end

desc "import legacy orders"
task orders: :environment do
  orders = legacy_database.query("SELECT * FROM oc_order")
  # init progressbar
  progressbar = ProgressBar.create(:total => orders.count, :format => "%E, \e[0;34m%t: |%B|\e[0m")

  orders.each do |order|
    if [1, 2, 13, 14].include? order['order_status_id']
      payment_method = "wx"
      if order['paid_by'] == "Alipay"
        payment_method = "ap"
      elsif order['paid_by'] == "UnionPay"
        payment_method = "up"
      end

      user_id = User.where(import_id: order['customer_id']).first
      user_id = user_id.id if user_id

      Order.create(
        # id: order['order_id'],
        import_id: order['order_id'],
        # user_id: order['customer_id'],
        user_id: user_id,
        receiver_name: order['payment_firstname'],
        receiver_address: order['payment_address_1'],
        created_at: order['date_added'],
        updated_at: order['date_modified'],
        paid_by: payment_method,
        order_num: order['order_id']
      )

      # increment progress bar on each save
      progressbar.increment
    end
  end
end
I assume the line orders = legacy_database.query("SELECT * FROM oc_order") loads the entire table into memory, which is very inefficient.
You need to iterate over the table in batches. In ActiveRecord there is the find_each method for that, but since you don't use ActiveRecord here, you may want to implement your own batch querying using limit and offset.
In order to handle memory efficiently, you can run the MySQL query in batches, as suggested by nattfodd.
There are two ways to achieve it, as per the MySQL documentation:
SELECT * FROM oc_order LIMIT 5,10;
or
SELECT * FROM oc_order LIMIT 10 OFFSET 5;
Both of the queries will return rows 6-15.
You can pick an offset and batch size of your choice and run the queries in a loop until your orders result is empty.
Let us assume you handle 1000 orders at a time, then you'll have something like this:
batch_size = 1000
offset = 0

loop do
  orders = legacy_database.query("SELECT * FROM oc_order LIMIT #{batch_size} OFFSET #{offset}")
  break unless orders.present?

  offset += batch_size

  orders.each do |order|
    # ... your logic of creating new model objects
  end
end
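The loop's pagination and termination logic can be checked without a database. Here `fetch_batch` is a made-up stand-in for the LIMIT/OFFSET query, and the 25 integers stand in for rows of oc_order:

```ruby
ROWS = (1..25).to_a  # pretend rows of oc_order

# Stand-in for: SELECT * FROM oc_order LIMIT #{limit} OFFSET #{offset}
def fetch_batch(rows, limit, offset)
  rows[offset, limit] || []
end

batch_size = 10
offset = 0
imported = []

loop do
  batch = fetch_batch(ROWS, batch_size, offset)
  break if batch.empty?     # same exit condition as `break unless orders.present?`
  offset += batch_size
  imported.concat(batch)    # stands in for creating Order records
end

imported.size  # => 25
```

Note the last batch is allowed to be smaller than `batch_size`; the loop only stops on an empty batch.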
It is also advised to run your code in production with proper error handling:
begin
  # ... main logic
rescue => e
  # ... handle error
ensure
  # ... cleanup
end
Disabling row caching while iterating over the orders collection should reduce the memory consumption:
orders.each(cache_rows: false) do |order|
There is a gem that helps with this, called activerecord-import.
bulk_orders = []

orders.each do |order|
  bulk_orders << Order.new(
    # id: order['order_id'],
    import_id: order['order_id'],
    # user_id: order['customer_id'],
    user_id: user_id,
    receiver_name: order['payment_firstname'],
    receiver_address: order['payment_address_1'],
    created_at: order['date_added'],
    updated_at: order['date_modified'],
    paid_by: payment_method,
    order_num: order['order_id']
  )
end

Order.import bulk_orders, validate: false
This inserts all of the records with a single INSERT statement.

Group by one column and sum by another

I'm trying to write a rake task that contains a query that groups by one value on a join table and then sums another column. I'd like to do it using the query interface. The purpose of this task is to find the videos that have been the most popular over the last 5 days.
In pertinent part:
course_ids = Course.where(school_id: priority_schools).pluck(:id)
sections = Section.where(course_id: course_ids)

sections.each do |section|
  users = section.users.select { |user| user.time_watched > 0 }
  user_ids = users.map(&:id)

  user_videos = UserVideo.group(:video_id).
    select(:id, :video_id, :time_watched).
    where("created_at > ?", Date.today - 5.days).
    where(user_id: user_ids).
    sum(:time_watched)

  p "user_videos: #{user_videos.inspect}"
end
Any suggestions for the best way to write this query?
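As a sanity check on what the grouped sum should produce, here is the same aggregation in plain Ruby over made-up watch records (the field names follow the question; the data is illustrative only):

```ruby
require 'date'

# Made-up sample data in place of the UserVideo rows.
watches = [
  { video_id: 1, time_watched: 30, created_at: Date.today - 1 },
  { video_id: 1, time_watched: 20, created_at: Date.today - 2 },
  { video_id: 2, time_watched: 15, created_at: Date.today - 10 },  # outside the 5-day window
  { video_id: 2, time_watched: 40, created_at: Date.today - 3 },
]

# created_at > (5 days ago), then GROUP BY video_id with SUM(time_watched).
recent = watches.select { |w| w[:created_at] > Date.today - 5 }
totals = recent.group_by { |w| w[:video_id] }
               .transform_values { |ws| ws.sum { |w| w[:time_watched] } }

totals  # => {1 => 50, 2 => 40}
```

Sorting `totals` by value descending would then give the most popular videos first.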

How to create a select where count is not zero in MySQL

Here's what I'm trying to do. I'm trying to select from a forum views table all of the user_ids where there are 5 or more records. That's fairly easy (this is Zend):
$objCountSelect = $db->select()
    ->from(array('v' => 'tbl_forum_views'), 'COUNT(*) AS count')
    ->where('u.id = v.user_id')
    ->having('COUNT(user_id) >= ?', 5)
;
But I need to somehow connect this to my users table: I don't want to return a result unless the count is 5 or more. I tried this:
$objSelect = $db->select()
    ->from(array('u' => 'tbl_users'), array(
        'id as u_id',
        'count' => new Zend_Db_Expr('(' . $objCountSelect . ')'),
    ))
;
But that returns a record for every user, leaving the count blank when it's less than 5. How do I exclude the rows where the count is blank?
I figured it out, but wanted to post the answer in case someone else had the same issue. I added:
->having('count > 0')
to the second select and now it works.
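For anyone who wants to see what the HAVING filter keeps, the same logic can be expressed in plain Ruby over a made-up list of view rows (one user_id per row of tbl_forum_views):

```ruby
# One user_id per view row (made-up data).
view_user_ids = [1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3]

counts = view_user_ids.tally                  # user_id => COUNT(*)
frequent = counts.select { |_id, c| c >= 5 }  # HAVING COUNT(user_id) >= 5

frequent  # => {1 => 5, 3 => 6}
```

User 2, with only 2 views, drops out, which is exactly what the HAVING clause does server-side.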

Convert this code into active record/sql query

I have the following code and would like to convert the request into a MySQL query. Right now I achieve the desired result using a manual .select (the Array method) on the data. This should be possible with a single query (correct me if I am wrong).
Current code:
def self.active_companies(zip_code = nil)
  if !zip_code
    query = Company.locatable.not_deleted
  else
    query = Company.locatable.not_deleted.where("zip_code = ?", zip_code)
  end

  query.select do |company|
    company.company_active?
  end
end
# Check if the company can be considered as active
def company_active?(min_orders = 5, last_order_days_ago = 15)
  if orders.count >= min_orders &&
     orders.last.created_at >= last_order_days_ago.days.ago &&
     active
    return true
  else
    return false
  end
end
Explanation:
I want to find out which companies are active. We have a company model and an orders model.
Data:
Company:
active
orders (associated orders)
Orders:
created_at
I don't know if it is possible to turn the company_active? predicate into a single SQL query, but I can offer an alternative. If you do:
query = Company.locatable.not_deleted.includes(:orders)
all of the relevant orders will be loaded into memory for later processing.
This eliminates all the queries except two:
one to get the companies, and one to get all their associated orders.
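The in-memory filtering that runs after those two queries can be sketched in plain Ruby, with structs standing in for the ActiveRecord models and Date arithmetic in place of the Rails days.ago helper (all names and sample data here are illustrative):

```ruby
require 'date'

LegacyOrder = Struct.new(:created_at)

DemoCompany = Struct.new(:active, :orders) do
  # Same predicate as in the question, minus the Rails helpers.
  def company_active?(min_orders = 5, last_order_days_ago = 15)
    orders.count >= min_orders &&
      orders.last.created_at >= Date.today - last_order_days_ago &&
      active
  end
end

fresh = DemoCompany.new(true, Array.new(5) { LegacyOrder.new(Date.today - 1) })   # active
stale = DemoCompany.new(true, Array.new(5) { LegacyOrder.new(Date.today - 30) })  # last order too old
few   = DemoCompany.new(true, Array.new(2) { LegacyOrder.new(Date.today - 1) })   # too few orders

active_companies = [fresh, stale, few].select(&:company_active?)
active_companies.size  # => 1
```

With includes(:orders) in place, the orders.count and orders.last calls above hit the preloaded collection instead of issuing one query per company.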