I'm building a reporting system for a chat made in Ruby on Rails but received some comments telling me that my approach is inefficient.
Here's a little sample of how my reports work:
I have a handler that runs each month and calls a report mailer like this:
ReportMailer.monthly_report(user).deliver_later
This is how the mailer looks:
class ReportMailer < ApplicationMailer
default from: ENV["DEFAULT_MAILER_FROM"],
template_path: 'mailers/report_mailer'
def monthly_report(agent)
@agent = agent
@organization = agent.organization
@report = Report.new @organization
mail(to: agent.email, subject: @report.email_subject)
end
end
I'm trying to calculate the data using a "plain old" Ruby class:
module Reports
class Component < Report
def initialize(subject)
@component = subject
@cache = {}
end
attr_reader :component
# DELEGATIONS
# -----------------------
delegate :chat_messages, to: :component
def response_count
count = 0
explore_msgs { |msg, next_msg| count += 1 if response? msg, next_msg }
return count
end
def response_time
time = 0
explore_msgs { |msg, next_msg| time += time_difference msg, next_msg if response? msg, next_msg }
return time.to_i.seconds
end
def avg_response_time
@cache[__method__] ||= (response_time / response_count if response_count > 0)
end
private
def response?(msg, next_msg)
next_msg&.user_type == 'Agent' && msg.user_type == 'User' && msg.conversation_id == next_msg.conversation_id && time_difference(msg, next_msg).seconds < 8.hours
end
def time_difference(msg, next_msg)
(next_msg.created_at - msg.created_at).abs
end
def explore_msgs
chat_messages.each_with_index do |msg, i|
next_msg = chat_messages[i+1]
yield msg, next_msg
end
end
end
end
I'm concerned with improving performance. I implemented a simple caching system in the class in charge of the calculations, which made a huge improvement in the system's efficiency. However, I'm concerned that making these calculations in Ruby might create bottlenecks or might not scale.
It could be faster. The problem I see is that you are looking at one record and the following record. So how would you get the database to compare the two records?
In straight SQL, I would join the table to itself, group by the first instance of the table and do a min(created_at) on the second instance of the table.
Using our companies table, the SQL looks like this:
select rc1.id, rc1.created_at, min(rc2.created_at)
from companies rc1 inner join companies rc2 on rc1.created_at < rc2.created_at
group by rc1.id
You can add the difference to the select.
This will certainly be slow if the created_at field is not indexed and the number of records in the table is large.
You can add the test for Agent and User to the having clause.
The query is tricky and the database might not be able to do this fast. It will also be tricky if you try to get ActiveRecord to build the query for you.
However, I think everything you are trying to do in your code can be done this way by the database.
Your query might look like this:
select chat_messages.*,
min(next_msg.created_at) as next_created_at,
next_msg.created_at - chat_messages.created_at as created_at_diff
from chat_messages inner join chat_messages next_msg
on chat_messages.created_at < next_msg.created_at
and chat_messages.user_type = 'User'
group by chat_messages.id
having next_msg.user_type = 'Agent'
and TIMESTAMPDIFF(HOUR, chat_messages.created_at, min(next_msg.created_at)) < 8
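If the calculation has to stay in Ruby for now, the two passes over chat_messages can probably be folded into one. This is only a sketch: Msg is a stand-in Struct rather than your model, response_stats is a name I made up, and it assumes the messages are already loaded and sorted by created_at.

```ruby
# Stand-in for a chat message record (hypothetical, not the real model).
Msg = Struct.new(:user_type, :conversation_id, :created_at)

EIGHT_HOURS = 8 * 60 * 60 # seconds

# Walks consecutive pairs once and collects count and total response time together.
def response_stats(messages)
  count = 0
  total = 0.0
  messages.each_cons(2) do |msg, next_msg|
    next unless msg.user_type == "User" && next_msg.user_type == "Agent"
    next unless msg.conversation_id == next_msg.conversation_id

    diff = (next_msg.created_at - msg.created_at).abs
    next unless diff < EIGHT_HOURS

    count += 1
    total += diff
  end
  average = count.zero? ? nil : total / count
  { count: count, total_seconds: total, average_seconds: average }
end
```

each_cons(2) yields every message together with its successor, so there is no index lookup and no trailing nil to guard against.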
I'm trying to replicate the search list style of Crunchbase using Ruby on Rails.
I have an array of filters that looks something like this:
[
{
"id":"0",
"className":"Company",
"field":"name",
"operator":"starts with",
"val":"a"
},
{
"id":"1",
"className":"Company",
"field":"hq_city",
"operator":"equals",
"val":"Karachi"
},
{
"id":"2",
"className":"Category",
"field":"name",
"operator":"does not include",
"val":"ECommerce"
}
]
I send this JSON string to my Rails controller, where I have implemented this logic:
filters = params[:q]
table_names = {}
filters.each do |filter|
filter = filters[filter]
className = filter["className"]
fieldName = filter["field"]
operator = filter["operator"]
val = filter["val"]
if table_names[className].blank?
table_names[className] = []
end
table_names[className].push({
fieldName: fieldName,
operator: operator,
val: val
})
end
table_names.each do |k, v|
i = 0
where_string = ''
val_hash = {}
v.each do |field|
if i > 0
where_string += ' AND '
end
where_string += "#{field[:fieldName]} = :#{field[:fieldName]}"
val_hash[field[:fieldName].to_sym] = field[:val]
i += 1
end
className = k.constantize
puts className.where(where_string, val_hash)
end
What I do is loop over the JSON array and create a hash whose keys are table names and whose values are arrays holding the column name, the operator and the value to apply that operator to. So I would have something like this after the table_names hash is created:
{
'Company':[
{
fieldName:'name',
operator:'starts with',
val:'a'
},
{
fieldName:'hq_city',
operator:'equals',
val:'karachi'
}
],
'Category':[
{
fieldName:'name',
operator:'does not include',
val:'ECommerce'
}
]
}
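For reference, the same grouping can be sketched with Enumerable#group_by. This is a standalone illustration that assumes the filters arrive as a plain array of hashes (in the controller they come from params, whose shape may differ), and it uses string keys for the class names:

```ruby
filters = [
  { "id" => "0", "className" => "Company",  "field" => "name",    "operator" => "starts with",      "val" => "a" },
  { "id" => "1", "className" => "Company",  "field" => "hq_city", "operator" => "equals",           "val" => "Karachi" },
  { "id" => "2", "className" => "Category", "field" => "name",    "operator" => "does not include", "val" => "ECommerce" }
]

# Group the filters by model name, keeping only the parts needed for the query.
table_names = filters
  .group_by { |filter| filter["className"] }
  .transform_values { |group|
    group.map { |f| { fieldName: f["field"], operator: f["operator"], val: f["val"] } }
  }
```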
Now I loop over the table_names hash and create a where query using the Model.where("column_name = :column_name", {column_name: 'abcd'}) syntax.
So I would be generating two queries:
SELECT "companies".* FROM "companies" WHERE (name = 'a' AND hq_city = 'b')
SELECT "categories".* FROM "categories" WHERE (name = 'c')
I have two problems now:
1. Operators:
I have many operators that can be applied on a column like 'starts with', 'ends with', 'equals', 'does not equals', 'includes', 'does not includes', 'greater than', 'less than'. I am guessing the best way would be a switch case on the operator, using the appropriate symbol while building the where string. For example, if the operator is 'starts with', I'd do something like where_string += "#{field[:fieldName]} like %:#{field[:fieldName]}", and likewise for the others.
So is this approach correct and is this type of wildcard syntax allowed in this kind of .where?
2. More than 1 table
As you saw, my approach builds 2 queries for 2 tables. I do not need 2 queries; I need the category name to be in the same query, where the category belongs to the company.
What I need is a query like this:
Company.joins(:categories).where("name = :name and hq_city = :hq_city and categories.name = :categories[name]", {name: 'a', hq_city: 'Karachi', categories: {name: 'ECommerce'}})
But this is not it. The search can become very very complex. For example:
A Company has many FundingRound. FundingRound can have many Investment and Investment can have many IndividualInvestor. So I can create a filter like:
{
"id":"0",
"className":"IndividualInvestor",
"field":"first_name",
"operator":"starts with",
"val":"za"
}
My approach would create a query like this:
SELECT "individual_investors".* FROM "individual_investors" WHERE (first_name like %za%)
This query is wrong. I want to query the individual investors of the investments of the funding round of the company. Which is a lot of joining tables.
The approach that I have used is applicable to a single model and cannot solve the problem that I stated above.
How would I solve this problem?
You can create a SQL query based on your hash. The most generic approach is raw SQL, which can be executed by ActiveRecord.
Here is some concept code that should give you the right idea:
query_select = "select * from "
query_where = ""
tables = [] # for selecting from all tables
hash.each do |table, values|
table_name = table.constantize.table_name
tables << table_name
values.each do |q|
query_where += " AND " unless query_where.empty?
# identifiers get identifier quoting, not string quoting
query_where += "#{ActiveRecord::Base.connection.quote_table_name(table_name)}."
query_where += ActiveRecord::Base.connection.quote_column_name(q[:fieldName])
if q[:operator] == "starts with" # this should be done with an appropriate method
# quote the whole pattern so the value is escaped and wrapped in quotes once
query_where += " LIKE #{ActiveRecord::Base.connection.quote("#{q[:val]}%")}"
end
end
end
query_tables = tables.join(", ")
raw_query = query_select + query_tables + " where " + query_where
result = ActiveRecord::Base.connection.execute(raw_query)
result.to_a # not required, but the raw rows are probably easier to handle as an array
What this does:
query_select specifies what information you want in the result
query_where builds all the search conditions and escapes input to prevent SQL injections
query_tables is a list of all the tables you need to search
table_name = table.constantize.table_name will give you the SQL table_name as used by the model
raw_query is the actual combined sql query from the parts above
ActiveRecord::Base.connection.execute(raw_query) executes the sql on the database
Make sure to put any user submitted input in quotes and escape it properly to prevent SQL injections.
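One way to cover the remaining operators is a lookup table that pairs each operator with a SQL fragment and a rule for building the bound value. The following is a sketch in plain Ruby, not a finished implementation: FilterOperators, TABLE, condition and escape_like are names invented for the example, the operator spellings follow the JSON in the question, and the backslash is assumed to be the LIKE escape character (the default in MySQL and PostgreSQL).

```ruby
module FilterOperators
  # Escape LIKE wildcards in user input so "50%" matches a literal percent sign.
  def self.escape_like(value)
    value.to_s.gsub(/[\\%_]/) { |char| "\\#{char}" }
  end

  # operator name => [SQL fragment with a placeholder, lambda building the bound value]
  TABLE = {
    "equals"           => ["= ?",        ->(v) { v }],
    "does not equal"   => ["<> ?",       ->(v) { v }],
    "greater than"     => ["> ?",        ->(v) { v }],
    "less than"        => ["< ?",        ->(v) { v }],
    "starts with"      => ["LIKE ?",     ->(v) { "#{escape_like(v)}%" }],
    "ends with"        => ["LIKE ?",     ->(v) { "%#{escape_like(v)}" }],
    "includes"         => ["LIKE ?",     ->(v) { "%#{escape_like(v)}%" }],
    "does not include" => ["NOT LIKE ?", ->(v) { "%#{escape_like(v)}%" }]
  }.freeze

  # Returns [condition_string, bound_value]; raises on an operator not in TABLE.
  def self.condition(column, operator, value)
    fragment, to_bind = TABLE.fetch(operator) do
      raise ArgumentError, "unknown operator: #{operator.inspect}"
    end
    ["#{column} #{fragment}", to_bind.call(value)]
  end
end
```

With ActiveRecord the pair can be splatted into where, for example Company.where(*FilterOperators.condition("companies.name", "starts with", "a")). The column name still has to come from a list you control; only the value goes through the placeholder.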
For your example the created query will look roughly like this (identifier quoting omitted, since it depends on the database adapter):
select * from companies, categories where companies.name LIKE 'a%' AND companies.hq_city = 'karachi' AND categories.name NOT LIKE '%ECommerce%'
This approach might need additional logic for joining tables that are related.
In your case, if company and category have an association, you have to add something like this to the query_where (the foreign key column depends on how the association is defined):
"AND companies.category_id = categories.id"
Easy approach: You can create a Hash for all pairs of models/tables that can be queried and store the appropriate join condition there. This Hash shouldn't be too complex even for a medium-sized project.
Hard approach: This can be done automatically, if you have has_many, has_one and belongs_to properly defined in your models. You can get the associations of a model using reflect_on_all_associations. Implement a Breadth-First Search or Depth-First Search algorithm, start with any model, and search for matching associations to the other models from your JSON input. Start new BFS/DFS runs until there are no unvisited models from the JSON input left. From the information found, you can derive all join conditions and then add them as expressions in the where clause of the raw SQL approach explained above. Even more complex, but also doable, would be reading the database schema and using a similar approach that looks for foreign keys.
Using associations: If all of them are associated with has_many / has_one, you can handle the joins with ActiveRecord by using the joins method with inject on the "most significant" model like this:
base_model = "Company".constantize
associations = [:categories] # and so on
result = associations.inject(base_model) { |model, assoc| model.joins(assoc) }.where(query_where)
What this does:
it passes the base_model as the starting value to Enumerable#inject, which repeatedly calls joins on the accumulated relation (for my example this does Company.send(:joins, :categories), which is equivalent to Company.joins(:categories))
on the combined join, it executes the where conditions (constructed as described above)
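The search step of the hard approach can be sketched as a breadth-first search over a graph of associations. To keep the example runnable without Rails, the graph is a hand-written hash (ASSOCIATIONS, an assumption standing in for what reflect_on_all_associations would give you for the models in the question), and association_path is a made-up helper name.

```ruby
# model name => { association name => target model name } (hand-written stand-in)
ASSOCIATIONS = {
  "Company"            => { funding_rounds: "FundingRound", categories: "Category" },
  "FundingRound"       => { investments: "Investment" },
  "Investment"         => { individual_investors: "IndividualInvestor" },
  "Category"           => {},
  "IndividualInvestor" => {}
}.freeze

# Returns the list of association names leading from one model to another,
# or nil when the target cannot be reached.
def association_path(graph, from, to)
  queue = [[from, []]]
  visited = { from => true }
  until queue.empty?
    model, path = queue.shift
    return path if model == to

    graph.fetch(model, {}).each do |assoc_name, target|
      next if visited[target]

      visited[target] = true
      queue << [target, path + [assoc_name]]
    end
  end
  nil
end

path = association_path(ASSOCIATIONS, "Company", "IndividualInvestor")
```

Because the search is breadth-first, the first path found is also a shortest one in terms of the number of joins.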
Disclaimer The exact syntax you need might vary based on the SQL implementation you use.
A full-blown SQL string is a security issue because it exposes your application to SQL injection attacks. If you can work around this, it is completely OK to concatenate queries like that, as long as you make them compatible with your DB (yes, this solution is DB specific).
Other than that, you can add a field that marks a query as joined. As I mentioned in the comment, this would be a variable naming the table you want as the output of the query, something like:
[
{
"id":"1",
"className":"Category",
"field":"name",
"operator":"does not include",
"val":"ECommerce",
"queryModel":"Company"
}
]
When processing the filter, you would return the results as the queryModel instead of the className; in those cases the className is used only to join the table conditions.
I would suggest altering your JSON data. Right now you only send the name of the model, without any context; it would be easier if each filter carried its context.
For your example, the data would have to look like this:
data = [
{
id: '0',
className: 'Company',
relation: 'Company',
field: 'name',
operator: 'starts with',
val: 'a'
},
{
id: '1',
className: 'Category',
relation: 'Company.categories',
field: 'name',
operator: 'equals',
val: '12'
},
{
id: '3',
className: 'IndividualInvestor',
relation: 'Company.funding_rounds.investments.individual_investors',
field: 'name',
operator: 'equals',
val: '12'
}
]
And you send this data to QueryBuilder
query = QueryBuilder.new(data)
results = query.find_records
Note: find_records returns an array of hashes, one per model on which a query is executed.
For example, it would return [{Company: [...]}]
class QueryBuilder
def initialize(data)
@data = prepare_data(data)
end
def find_records
queries = @data.group_by {|e| e[:model]}
queries.map do |k, v|
q = v.map do |f|
{
field: "#{f[:table_name]}.#{f[:field]} #{read_operator(f[:operator])} ?",
value: value_based_on_operator(f[:val], f[:operator])
}
end
db_query = q.map {|e| e[:field]}.join(" AND ")
values = q.map {|e| e[:value]}
{"#{k}": k.constantize.joins(join_hash(v)).where(db_query, *values)}
end
end
private
def join_hash(array_of_relations)
hash = {}
array_of_relations.each do |f|
hash.merge!(array_to_hash(f[:joins]))
end
hash.map do |k, v|
if v.nil?
k
else
{"#{k}": v}
end
end
end
def read_operator(operator)
case operator
when 'equals'
'='
when 'starts with'
'LIKE'
end
end
def value_based_on_operator(value, operator)
case operator
when 'equals'
value
when 'starts with'
"#{value}%"
end
end
def prepare_data(data)
data.each do |record|
record.tap do |f|
f[:model] = f[:relation].split('.')[0]
f[:joins] = f[:relation].split('.').drop(1)
f[:table_name] = f[:className].constantize.table_name
end
end
end
def array_to_hash(array)
if array.length < 1
{}
elsif array.length == 1
{"#{array[0]}": nil}
elsif array.length == 2
{"#{array[0]}": array[1]}
else
{"#{array[0]}": array_to_hash(array.drop(1))}
end
end
end
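To see what ends up being handed to joins, the relation string can be traced in isolation. This is a small sketch, separate from the class above: relation_to_joins is a hypothetical helper, and unlike array_to_hash it uses symbols throughout.

```ruby
# Splits "Base.a.b.c" into the base model name and a nested joins argument.
def relation_to_joins(relation)
  base, *associations = relation.split(".")
  # Fold from the innermost association outwards: [:a, :b, :c] => { a: { b: :c } }
  nested = associations.map(&:to_sym).reverse.inject { |inner, outer| { outer => inner } }
  [base, nested]
end

base, joins = relation_to_joins("Company.funding_rounds.investments.individual_investors")
```

Here base is "Company" and joins is a nested hash of the kind that Company.joins accepts for multi-level associations.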
I feel you are overcomplicating things by having one single controller for everything. I would create a controller for every model or entity that you want to show and then implement the filters like you said.
Implementing a dynamic where and order by is not very hard, but if, as you said, you also need the logic to implement joins, you are not only overcomplicating the solution (because you will have to keep this controller updated every time you add a new model or entity, or change the basic logic) but also letting people start playing with your data.
I am not very familiar with Rails, so sadly I cannot give you any specific code beyond saying that your approach seems OK to me. I would split it into multiple controllers.
I have created a Rails 5 application with MySQL as the backend. I have a "Server" model with a "disk_size" field; the field is a varchar and the values are stored like "10 GB", "512 MB". I want the results ordered by disk_size in ascending order, like "512 MB", "10 GB".
I think this will generate a lot of overhead in calculating the sizes from the strings. Plus you can't use SQL to order these, which would be way faster.
A better way would be to store them in bytes, as an integer column that can be sorted.
If you then want to print out the size you can use the number_to_human_size helper. See the docs
E.g.
irb(main):003:0> size = 10_000_000
=> 10000000
irb(main):004:0> helper.number_to_human_size(size)
=> "9.54 MB"
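If you go with a bytes column, the existing strings need converting once. A minimal sketch, assuming decimal units (1 GB = 1_000_000_000 bytes) and values shaped like "10 GB"; UNIT_BYTES and size_to_bytes are names made up for the example:

```ruby
# Decimal multipliers; extend with whatever units actually occur in the column.
UNIT_BYTES = {
  "KB" => 1_000,
  "MB" => 1_000_000,
  "GB" => 1_000_000_000,
  "TB" => 1_000_000_000_000
}.freeze

# Parses strings such as "10 GB" or "1.5 GB" into an integer number of bytes.
def size_to_bytes(text)
  match = text.strip.match(/\A(\d+(?:\.\d+)?)\s*([A-Za-z]+)\z/)
  raise ArgumentError, "unparseable size: #{text.inspect}" unless match

  factor = UNIT_BYTES.fetch(match[2].upcase) do
    raise ArgumentError, "unknown unit: #{match[2].inspect}"
  end
  (match[1].to_f * factor).round
end

sorted = ["10 GB", "512 MB", "1.5 GB"].sort_by { |size| size_to_bytes(size) }
```

Once the integer column is filled, the ordering can be done by the database with a plain order on that column.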
EDIT:
According to Andrey Deineko, this was not an answer.
So here's the ruby way (without a very very very long sql query to solve this):
Build a class named "FileSize" for example (You can put that in app/models/):
class FileSize
include Comparable
UNITS = {
"MB" => 1_000_000,
"GB" => 1_000_000_000,
# ...
}
def initialize(str)
@str = str
end
def to_bytes
count, unit = @str.split(" ")
if UNITS.key?(unit)
count.to_i * UNITS[unit]
else
raise "Don't know unit #{unit.inspect}, please specify."
end
end
def <=>(other)
to_bytes <=> other.to_bytes
end
def inspect
@str
end
def to_s
@str
end
end
Complete the UNITS constant with all the units you have.
If you're using ActiveRecord, you can override the getter in your model:
class MyModel < ApplicationRecord
def disk_size
FileSize.new(super)
end
end
So you can do this:
MyModel.where(something: "is").sort_by(&:disk_size)
So I'm trying to write a method in the model that will allow me to return posts who have a specific field value that is greater than 0.
So I have posts that have fields that are essentially tags. Basically a post has four fields: hiphop, electro, house and pop. Each field has a value between 0 and 10.
I'm trying to make it so that if someone clicks a button in the view that says "Hip Hop", it returns all posts that have a hiphop field value greater than 0.
I know this is wrong but I'm thinking something like this
def self.tagSearch(query)
where("#{query} > 0")
end
and in my controller I would have something like this
def index
if params[:search]
@songs = Song.search(params[:search]).order("created_at DESC")
elsif params[:tag]
@songs = Song.tagSearch(params[:tag]).order("created_at DESC")
else
@songs = Song.all
end
end
And I'm not sure about the view but maybe a button that passes the tag value parameter. The thing is I just want it to be a button, I don't need them to input anything.
I hope this isn't too confusing.
Thank you!
Matt
To expand on RaVen's post:
1) Use Ruby naming conventions: tagSearch should be tag_search; methods are snake case (lower case with underscores).
2) where("#{query} > 0") exposes you to SQL injection attacks. I recommend installing the brakeman gem, which can flag security issues like this:
http://brakemanscanner.org/docs/warning_types/sql_injection/
http://brakemanscanner.org/docs/
3) You can simplify your code by chaining scopes; scopes that return nil will not affect the query
class Song
scope :search, -> (query) do
where("name LIKE ?", "#{query}%") if query.present?
end
scope :tag_search, -> (tag) do
where("#{tag} > 0") if tag.present? # still interpolates user input; point 4 replaces this with a safe version
end
scope :ordered, -> do
order(created_at: :desc)
end
end
class SongsController
def index
@songs = Song.search(params[:search])
.tag_search(params[:tag])
.ordered
end
end
4) Making queries based on a user-specified column while avoiding SQL injection:
This is one way to do it. There are probably better ones available, like using the model's arel_table, but this one is pretty straightforward:
scope :tag_search, -> (tag) do
where("#{self.white_list(tag)} > 0") if tag.present?
end
def self.white_list(column_name)
# if the specified column_name matches a model attribute then return that attribute
# otherwise return nil which will cause a sql error
# but it won't let arbitrary sql execution
self.attribute_names.detect { |attribute| attribute == column_name }
end
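A slightly stricter variant is to check against an explicit list of tag columns instead of every attribute name, so that columns such as id cannot be used as tags. A minimal sketch in plain Ruby; TAG_COLUMNS and tag_condition are names invented here, and the column list is taken from the question:

```ruby
# The only column names a tag parameter may refer to.
TAG_COLUMNS = %w[hiphop electro house pop].freeze

# Returns a SQL condition for a known tag column, or nil for anything else.
def tag_condition(tag)
  column = TAG_COLUMNS.find { |name| name == tag.to_s }
  column && "#{column} > 0"
end
```

Inside the scope this could be used as condition = tag_condition(tag), followed by where(condition) if condition, so an unknown tag leaves the query unchanged rather than raising a SQL error.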
Rails supports "scopes", which return an ActiveRecord::Relation, which means you can chain them together.
class Song
scope :tag_search, -> (column) { where("#{column} > 0") } # check column against a list of known columns first
scope :ordered, -> { order(created_at: :desc) }
end
class SongsController
def index
if params[:search]
@songs = Song.search(params[:search]).ordered
elsif params[:tag]
@songs = Song.tag_search(params[:tag]).ordered
else
@songs = Song.all
end
end
end
I would rethink the design of this.
Plus your tagSearch function is really dangerous. SQL INJECTION!
I have a mysql query that returns this type of data:
{"id"=>1, "serviceCode"=>"1D00", "price"=>9.19}
{"id"=>2, "serviceCode"=>"1D01", "price"=>9.65}
I need to return the id field based on a match of the serviceCode.
i.e. I need a method like this
def findID(serviceCode)
# find the row that has the service code and return the ID
end
I was thinking of having a serviceCodes.each do |row| method and loop through and essentially go
if row['serviceCode'] == serviceCode
return row['id']
end
Is there a faster / easier way?
You can use the method Enumerable#find:
service_codes = [
{"id"=>1, "serviceCode"=>"1D00", "price"=>9.19},
{"id"=>2, "serviceCode"=>"1D01", "price"=>9.65}
]
service_codes.find { |row| row['serviceCode'] == '1D00' }
# => {"id"=>1, "serviceCode"=>"1D00", "price"=>9.19}
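To get just the id, and to avoid scanning the array on every call when there are many lookups, something along these lines should work. It is a sketch: find_id and ids_by_code are names chosen for the example, and to_h with a block needs Ruby 2.6 or newer.

```ruby
service_codes = [
  { "id" => 1, "serviceCode" => "1D00", "price" => 9.19 },
  { "id" => 2, "serviceCode" => "1D01", "price" => 9.65 }
]

# Single lookup: returns the id, or nil when no row matches.
def find_id(rows, service_code)
  rows.find { |row| row["serviceCode"] == service_code }&.fetch("id")
end

# Repeated lookups: build the index once, then each lookup is a hash access.
ids_by_code = service_codes.to_h { |row| [row["serviceCode"], row["id"]] }
```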
If you use Rails Active Record as your ORM and your model is named Product (just as an example),
you can use something like this:
def findID(serviceCode)
# find_by returns the first matching record or nil; &.id then returns the id or nil
Product.find_by(serviceCode: serviceCode)&.id
end
If you run a plain SQL query from a plain Ruby class (not recommended), you should change the query to select only the id, as Luiggi mentioned. But beware of SQL injection if serviceCode comes from external requests.