How to remove duplicates in MySQL using Rails? - mysql

I have the following table:
Relations
[id,user_id,status]
1,2,sent_reply
1,2,sent_mention
1,3,sent_mention
1,4,sent_reply
1,4,sent_mention
I am looking for a way to remove duplicates, so that only the following rows will remain:
1,2,sent_reply
1,3,sent_mention
1,4,sent_reply
(Preferably using Rails)

I know this is way late, but I found a good way to do it using Rails 3. There are probably better ways, though, and I don't know how this will perform with 100,000+ rows of data, but this should get you on the right track.
# Get a hash of all id/user_id pairs and how many records of each pair
counts = ModelName.group([:id, :user_id]).count
# => {[1, 2]=>2, [1, 3]=>1, [1, 4]=>2}
# Keep only those pairs that have more than one record
dupes = counts.select{|attrs, count| count > 1}
# => {[1, 2]=>2, [1, 4]=>2}
# Map objects by the attributes we have
object_groups = dupes.map do |attrs, count|
  ModelName.where(:id => attrs[0], :user_id => attrs[1])
end
# Take each group and #destroy the records you want.
# Or call #delete instead to save time if you don't need ActiveRecord callbacks
# Here I'm just keeping the first one I find.
object_groups.each do |group|
  group.each_with_index do |object, index|
    object.destroy unless index == 0
  end
end
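The keep-the-first-of-each-group idea above can be tried without a database. This is only a sketch: plain Structs stand in for the ActiveRecord rows (an assumption for illustration), and the duplicates are collected rather than destroyed.

```ruby
# Plain Structs stand in for ActiveRecord rows (assumption for illustration).
Row = Struct.new(:id, :user_id, :status)

rows = [
  Row.new(1, 2, "sent_reply"),
  Row.new(1, 2, "sent_mention"),
  Row.new(1, 3, "sent_mention"),
  Row.new(1, 4, "sent_reply"),
  Row.new(1, 4, "sent_mention")
]

# Keep the first row seen for each (id, user_id) pair.
kept = rows.uniq { |r| [r.id, r.user_id] }

# Everything else is a duplicate; with real records these would be destroyed.
dupes = rows - kept

kept.each { |r| puts r.to_a.join(",") }
```

With real models, `dupes.each(&:destroy)` would be the step that actually removes the rows.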

It is better to do it through SQL. But if you prefer to use Rails:
(Relation.all - Relation.all.uniq_by{|r| [r.user_id, r.status]}).each{ |d| d.destroy }
or
ids = Relation.all.uniq_by{|r| [r.user_id, r.status]}.map(&:id)
Relation.where("id NOT IN (?)", ids).destroy_all # or delete_all, which is faster
But I don't like this solution :D

Related

Ruby using a date as key in a hash from MySQL

I have a hash which is the result of a .map method on a MySQL2::Result object which looks like this:
{#<Date: 2018-01-02 ((2458121j,0s,0n),+0s,2299161j)>=>"OL,BD,DM,WW,DG"}
{#<Date: 2018-01-03 ((2458122j,0s,0n),+0s,2299161j)>=>"KP,LW"}
{#<Date: 2018-01-04 ((2458123j,0s,0n),+0s,2299161j)>=>"LW,WW,FS,DG"}
{#<Date: 2018-01-05 ((2458124j,0s,0n),+0s,2299161j)>=>"OL,KP,BD,SB,LW,DM,AS,WW,FS,DG"}
{#<Date: 2018-01-06 ((2458125j,0s,0n),+0s,2299161j)>=>"OL,KP,BD,SB,LW,DM,AS,WW,FS,DG"}
I would like to pull the values (the two letter items) from the hash, by referencing with the key.
I have tried
puts hash_name["2018-01-06"]
puts hash_name['2018-01-06']
puts hash_name[Date.new(2018,1,6)]
puts hash_name["<Date: 2018-01-06 ((2458125j,0s,0n),+0s,2299161j)>"]
puts hash_name["#<Date: 2018-01-06 ((2458125j,0s,0n),+0s,2299161j)>"]
All return nothing or an error.
The hash is created by doing the following:
hash_name = @available_items.map do |h|
  { h["tdate"] => h["items"] }
end
Is there something I can do, either when creating the hash or afterwards, to pull a value out easily? For example, could I convert the keys to some other date format, such as ISO format?
Thanks
I think your problem is that Enumerable#map doesn't do what you think it does. This:
hash_name = @available_items.map do |h|
  { h["tdate"] => h["items"] }
end
will give you an array of single-entry hashes. Each individual hash maps a Date to a string, but the result looks like:
[
{ date1 => string1 },
{ date2 => string2 },
...
]
rather than:
{
date1 => string1,
date2 => string2,
...
}
as you're expecting. Switching to #each_with_object should take care of your problem:
hash_name = @available_items.each_with_object({}) do |row, h|
  h[row['tdate']] = row['items']
end
You're close here, but you're generating an array of hashes, not a singular hash:
hash_name = @available_items.map do |i|
  [ i["tdate"], i["items"] ]
end.to_h
This creates an array of key/value pair arrays, then converts them to a hash with the .to_h method.
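To make the difference concrete, here is a small sketch. The `available_items` array of hashes is a hypothetical stand-in for the Mysql2::Result rows, and the ISO-keyed variant is just one possible answer to the question about other date formats.

```ruby
require 'date'

# Hypothetical stand-in for the Mysql2::Result rows (assumption for illustration).
available_items = [
  { "tdate" => Date.new(2018, 1, 5), "items" => "OL,KP" },
  { "tdate" => Date.new(2018, 1, 6), "items" => "LW,WW" }
]

# map with a hash-returning block: an Array of one-entry hashes.
array_of_hashes = available_items.map { |h| { h["tdate"] => h["items"] } }

# map to pairs, then to_h: a single Hash keyed by Date.
by_date = available_items.map { |h| [h["tdate"], h["items"]] }.to_h

# Keying by ISO 8601 strings instead, if string lookups are preferred.
by_iso = available_items.each_with_object({}) do |row, h|
  h[row["tdate"].iso8601] = row["items"]
end

puts array_of_hashes.class
puts by_date[Date.new(2018, 1, 6)]
puts by_iso["2018-01-06"]
```

Once the result is a single hash, looking up with `Date.new(2018, 1, 6)` works because equal Date objects hash to the same key.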
You can also use group_by if your input data can be grouped neatly, like:
hash_name = @available_items.group_by do |i|
  i['tdate']
end
That approach might be good enough if you can deal with the output format: a hash keyed by date, where each value is an array of the rows for that date.
Note that using symbol keys like :tdate and :items is usually preferable to string keys. It's worth trying to steer towards that in most cases where there'd otherwise be rampant repetition of those strings.
In the hopes that this may help others to do a similar thing, here is what I ended up doing.
I have a MySQL2::Result object as shown above, on which I run:
@available_hash = @available_items.map do |row|
  [ row["tdate"], row["available"] ]
end.to_h
Having previously declared a start_date and an end_date, I then select an available item at random for each date and fill a new hash, using the dates as keys:
$final_hash = Hash.new("")
# The range drives the iteration, so the date does not need to be incremented by hand
for date in (start_date..end_date)
  @available_today = @available_hash[date].to_s.split(",")
  $final_hash[date] = random_item(@available_today)
end
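A more compact variant of the same loop might look like the sketch below. The sample data is made up, and `Array#sample` is used as a stand-in for the `random_item` helper, whose definition is not shown above.

```ruby
require 'date'

# Made-up data for illustration; the real hash comes from the query result.
available_hash = {
  Date.new(2018, 1, 2) => "OL,BD",
  Date.new(2018, 1, 3) => "KP,LW"
}
start_date = Date.new(2018, 1, 2)
end_date   = Date.new(2018, 1, 4)

# Build the result in one pass; dates with nothing available get "".
final_hash = (start_date..end_date).each_with_object(Hash.new("")) do |date, picks|
  candidates = available_hash[date].to_s.split(",")
  picks[date] = candidates.sample || "" # sample stands in for random_item
end

final_hash.each { |date, item| puts "#{date}: #{item}" }
```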
Whilst I am sure there is probably a more elegant way of doing this, I am delighted that you have helped me to get this to work!
A hash map is not well suited to using a date as the key; it is better suited to keys such as an id or a tag. The key should be unique.
Please provide more information about what you need to do with this hash map; there is probably a better data structure for it.
If you have an array of rows with two keys (tdate, items) and you want to look up a date, just use select:
result = available_items.select { |elem| elem['tdate'] === Date.new(2001,2,3) }
reference for '===' operator in Date class
http://ruby-doc.org/stdlib-2.1.1/libdoc/date/rdoc/Date.html#method-i-3D-3D-3D

Sorting users in Rails based on values in array attribute of user model

I have a User model which has an array inside of it. This array is used to store points the user has scored in various activities. It basically looks like this:
<ActiveRecord::Relation [#<User id: 1, fullname: "Kaja Sunniva Edvardsen", points: [0, 4170, 3860, 2504, 2971, 3859, 4346]>, #<User id: 2, fullname: "Alexander Lie Sr.", points: [0, 3273, 3681, 2297, 2748, 4202, 3477]>]>
I want to sort all Users by the different values in the points array, so that I can create a ranking list for each of the different activities: points[0], points[1], etc.
Sorting by points[1] should return Kaja first (4170 > 3273); sorting by points[5] should put Alexander first (4202 > 3859).
How do I do this?
As far as I know, MySQL does not have an integrated array type.
Assuming you have a model like this:
class User < ActiveRecord::Base
  # ...
  serialize :points, Array
  # ...
end
You cannot sort with order queries, but you can try another solution (less efficient), handling the resources as an array:
User.all.sort { |user1, user2| user2.points[1] <=> user1.points[1] }
This will return an array instead of an ActiveRecord relation. Also, bear in mind that this code does not handle nil values (for example, what if a user only has 2 elements in points?).
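One way to cope with short arrays is to treat a missing score as 0, which is an assumption about the desired ranking rather than something the question specifies. In this sketch, Structs stand in for the ActiveRecord users, the third user is hypothetical, and `rank_by` is an illustrative helper name.

```ruby
# Structs stand in for ActiveRecord users (assumption for illustration).
User = Struct.new(:id, :fullname, :points)

users = [
  User.new(1, "Kaja Sunniva Edvardsen", [0, 4170, 3860, 2504, 2971, 3859, 4346]),
  User.new(2, "Alexander Lie Sr.", [0, 3273, 3681, 2297, 2748, 4202, 3477]),
  User.new(3, "Example User", [0, 100]) # hypothetical user with a short array
]

# Highest score first; a missing score counts as 0.
def rank_by(users, index)
  users.sort_by { |u| -(u.points[index] || 0) }
end

puts rank_by(users, 1).map(&:fullname).inspect
puts rank_by(users, 5).map(&:fullname).inspect
```

With real models, `users` would be `User.all.to_a`, so the sort still happens in Ruby rather than in the database.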

Sinatra + DataMapper + MySQL

I am using Sinatra and DataMapper with MySQL and I am getting issues when I query the database.
My models.rb is the following:
require 'sinatra'
require 'dm-core'
require 'dm-migrations/adapters/dm-mysql-adapter'
DataMapper::Logger.new("log/datamapper.log", :debug)
DataMapper.setup(:default, 'mysql://user:password@localhost/testdb')
class Item
  include DataMapper::Resource
  property :id, Serial
  property :item, String, :length => 50
end
DataMapper.finalize
DataMapper.auto_upgrade!
Item.create(item:"item_one")
Item.create(item:"item_two")
The items are inserted in the database, but when I query the database it always returns nil values. For example:
(rdb:1) @items = Item.all
[#<Item @id=nil @item=nil>, #<Item @id=nil @item=nil>]
If I query the number of items, I get the expected result:
(rdb:1) @items.count
2
I have tried to make a query directly, getting the same result:
adapter = DataMapper.repository(:default).adapt
adapter.select("SELECT * FROM items")
Does anyone know what I'm doing wrong or have suggestions on what to look for to fix problem?
Add these two lines to models.rb:
adapter = DataMapper.repository(:default).adapter
print adapter.select("SELECT * FROM items")
(Notice .adapter, not .adapt.) It prints
[#<struct id=1, item="item_one">, #<struct id=2, item="item_two">]
Everything works as expected (ruby 2.1.7p400 (2015-08-18 revision 51632)).

Find table in an array with the most rows using Ruby, Nokogiri and Mechanize

@p = mechanize.get(url)
tables = @p.search('table.someclass')
I'm basically going over about 200 pages, putting the tables in an array and the only way to sort is to find the table with the greatest number of rows.
So I want to be able to look at each item in the array and select the first item with the greatest number of rows.
I've been trying to use max_by, but that won't work because I need to search inside each table (the array item) to find its tr count.
Two ways:
biggest = tables.max_by{ |table| table.css('tr').length }
biggest = tables.max_by{ |table| table.xpath('.//tr').length }
Since you didn't give an example URL, here's a similar search showing that max_by can be used:
require 'mechanize'
mechanize = Mechanize.new
rows = mechanize.get("http://phrogz.net/").search('table#infoporn tbody tr')
# Find the table row from the array that has the longest link URL in it
biggest = rows.max_by{ |tr| tr.at_xpath('.//a/@href').text.length }
p biggest.name, biggest.at('.//a/@href')
#=> "tr"
#=> [#<Nokogiri::XML::Attr:0x1681680 name="href" value="slow-file-reads-on-windows-ruby-1.9">]
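If it matters that the first table wins when several have the same row count, the tie-break can be made explicit instead of relying on how max_by happens to resolve ties. In this sketch, nested arrays stand in for the parsed tables (an assumption for illustration); with Nokogiri, `table.length` would become `table.css('tr').length` as in the answer above.

```ruby
# Nested arrays stand in for parsed tables; each inner array is one row.
tables = [
  [%w[a b]],
  [%w[a b], %w[c d], %w[e f]],
  [%w[g h], %w[i j], %w[k l]]
]

# Compare by [row count, negated position] so the earliest table wins ties.
biggest, position = tables.each_with_index.max_by do |table, i|
  [table.length, -i]
end

puts position
puts biggest.length
```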

transpose a html table

Is it possible to transpose an HTML table (without JavaScript)?
I'm generating a table with Rails (and ERB) from a list of objects, so it's easy and natural when each row corresponds to one object. However, I need each object to be represented as a column. I would like to have only one loop and describe each column, rather than repeating the same loop for every column. (It doesn't necessarily need to be a real table; it could be a list or anything else that does the trick.)
Update
To clarify the question: I don't want to transpose an array in Ruby, but to display an HTML table with the rows laid out vertically. My actual table uses one partial per row, which generates a list of cells (td). That can be changed to a list if that helps. Anyway, this is an HTML question, not a Ruby one: how to display a table with the rows vertically (rather than horizontally).
You may need something like this?
class Array
  def transpose
    # Check here if self is transposable (e.g. array of hashes)
    b = Hash.new
    self.each_index { |i| self[i].each { |j, a_ij| b[j] ||= Array.new; b[j][i] = a_ij } }
    return b
  end
end
a = [{:a => 1, :b => 2, :c => 3}, {:a => 4, :b => 5, :c => 6}]
a.transpose #=> {:a=>[1, 4], :b=>[2, 5], :c=>[3, 6]}
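A related server-side approach is to build one table row per attribute, so that each object ends up as a column. The sketch below uses a hypothetical `Product` struct and plain string building; it does no HTML escaping, which a real ERB view would handle.

```ruby
# Hypothetical objects for illustration.
Product = Struct.new(:name, :price)
products = [Product.new("pen", 2), Product.new("book", 9)]
fields = [:name, :price]

# One row per field: a label cell followed by that field's value for each object.
rows = fields.map do |field|
  [field.to_s] + products.map { |product| product[field].to_s }
end

# Render each row as a <tr>; no escaping is done in this sketch.
html = rows.map do |cells|
  "<tr>" + cells.map { |cell| "<td>#{cell}</td>" }.join + "</tr>"
end.join("\n")

puts html
```

This still loops over the objects once per field, so it rearranges the data before rendering rather than transposing the table in HTML itself.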
Apparently, the answer is no :-(