I am creating a report using the following gems:
require "mysql2"
require "watir"
require "io/console"
require "writeexcel"
After I query a database with mysql2 and convert the result into a multidimensional array like so:
Mysql2::Client.default_query_options.merge!(:as => :array)
mysql = Mysql2::Client.new(:host => "01.02.03.405", :username => "user", :password => "pass123", :database => "db")
report = mysql.query("SELECT ... ASC;")
arr = []
report.each {|row| arr << row}
and then finally write the data to an Excel spreadsheet like so:
workbook = WriteExcel.new("File.xls")
worksheet = workbook.add_worksheet(sheetname = "Report")
header = ["Column A Title", ... , "Column N Title"]
worksheet.write_row("A1", header)
worksheet.write_col("A2", arr)
workbook.close
when I open the file in the latest version of Excel for OS X (Office 365) I get the following error for every cell containing mostly numerals:
This report has a target audience that may become distracted by such an error.
I have attempted all of the .set_num_format options found in the documentation for writeexcel here.
How can I create a report with columns that contain special characters and numerals, such as currency, with writeexcel?
Should I look into utilizing another gem entirely?
Define the format after you create the worksheet.
format01 = workbook.add_format
format01.set_num_format('#,##0.00')
then write the column with the format.
worksheet.write_col("A2", arr, format01)
Since I'm not a Ruby user, this is just a S.W.A.G.
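Putting the two pieces together, here is a minimal sketch of the whole flow with a currency number format applied to the written column. The '$#,##0.00' format string and the sample header/data are assumptions for illustration; the writeexcel calls mirror the snippets above:
require "writeexcel"
workbook = WriteExcel.new("File.xls")
worksheet = workbook.add_worksheet("Report")
# The format object comes from the workbook and is passed to the write call.
currency = workbook.add_format
currency.set_num_format('$#,##0.00')
header = ["Item", "Amount"]
arr = [["Widget", 1234.5], ["Gadget", 99.99]]
worksheet.write_row("A1", header)
worksheet.write_col("A2", arr, currency)
workbook.close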
I'm writing a Ruby script that accesses a MySQL database using the Mysql2 gem.
After I get a set of results from a query, I want to examine a subset of the rows of the result set (e.g. rows 5 to 8) rather than the whole set.
I know I can do this with a while loop, something like this:
db = Mysql2::Client.new(:host => "myserver", :username => "user", :password => "pass", :database => "books")
rs = db.query "select * from bookslist"
i = 5
while i <= 8
puts rs.entries[i]
i += 1
end
db.close
But I'm aware this is probably not the best way to write this in Ruby. How can I make this more "idiomatic" Ruby code?
(I know my other option is to modify the query to return only the data I want. I still want to know how to do this in Ruby)
Ruby provides a range operator that you can use in a for loop. You could probably use the following code:
for i in 5..8
puts rs.entries[i]
end
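If you want something that reads as more idiomatic Ruby, slicing the materialized rows with a range and iterating with each should also work; a small sketch, assuming the result set has at least nine rows:
# entries materializes the result set as an array, so a range index selects rows 5 through 8
rs.entries[5..8].each { |row| puts row }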
So I'm working on a project that scrapes data from a website that has gun accident/death data. Here's what the website looks like: http://www.gunviolencearchive.org/officer-involved-shootings
I'm trying to grab each table row, make an object (instance? sorry, I'm new to Ruby) with the data from that row, and print it out to the console. Right now, the @occurances array contains the same data 26 times. Clearly it keeps picking up the first row. How would you suggest I store each of these instances?
Here is my code, the (choice) is the website address.
def self.data_from_choice(choice)
doc = Nokogiri::HTML(open(choice))
@occurances = []
doc.xpath("//tr").each do |x|
date = doc.css("td")[0].text
state = doc.css("td")[1].text
city = doc.css("td")[2].text
deaths = doc.css("td")[4].text
injured = doc.css("td")[5].text
source = doc.search(".links li.last a").attr("href").value
@occurances << {:date => date, :state => state, :city => city, :deaths => deaths, :injured => injured, :source => source}
end
puts @occurances
end
In the loop for each row you are calling doc.css(...). This causes a search from the top of the document each time (i.e. from doc). What I think you want is to make the search relative to the row, which you have in the x variable.
So change this:
date = doc.css("td")[0].text
to this
date = x.css("td")[0].text
and similarly for state, city etc.
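Putting the fix together, a corrected version of the method might look like the sketch below. The guard that skips rows without td cells is an assumption about the page's header row, not something from the original code:
require 'nokogiri'
require 'open-uri'
def self.data_from_choice(choice)
  doc = Nokogiri::HTML(open(choice))
  @occurances = []
  doc.xpath("//tr").each do |x|
    cells = x.css("td")
    next if cells.empty? # header rows have no <td> cells
    @occurances << {
      :date => cells[0].text,
      :state => cells[1].text,
      :city => cells[2].text,
      :deaths => cells[4].text,
      :injured => cells[5].text,
      :source => x.search(".links li.last a").attr("href").value
    }
  end
  puts @occurances
end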
I am currently working on an HTML scraper that takes a list of anime-planet URLs from a text file and then loops through them, parsing each page and storing the data in a database.
The scraper is working nicely; however, if I put in a large list then the chances of a URL not linking to a series properly and throwing an error are quite high. I want to make it so that IF the URL does not work, it notes down the URL in an array named 'error_urls' and just skips the record.
The end result being that the script finishes all working URLs and returns a list of non-working URLs I can work with later (maybe in a text file, or just displayed in the console).
I am currently using a rake task for this, which is working quite nicely. If anyone could help me with implementing the error handling functionality it would be much appreciated. Cheers!
scrape.rake:
task :scrape => :environment do
require 'nokogiri'
require 'open-uri'
text = []
File.read("text.txt").each_line do |line|
text << line.chop
end
text.each do |series|
url = "http://www.anime-planet.com/anime/" + series
data = Nokogiri::HTML(open(url))
title = data.at_css('.theme').text
synopsis = data.at_css('.synopsis').text.strip
synopsis.slice! "Synopsis:\r\n\t\t\t\t\t"
eps = data.at_css('.type').text
year = data.at_css('.year').text
rating = data.at_css('.avgRating').text
categories = data.at_css('.categories')
genre = categories.css('li').text.to_s
image = data.at_css('#screenshots img')
imagePath = "http://www.anime-planet.com" + image['src']
anime = Series.create({:title => title, :image => imagePath, :description => synopsis, :eps => eps, :year => year, :rating => rating})
anime.tag_list = genre
anime.save()
end
end
Small example of list.txt
5-Centimeters-Per-Second
11Eyes
A-Channel
Air
Air-Gear
Aishiteru-Ze-Baby
You can use open-uri's error handling. See this for more details.
url = "http://www.anime-planet.com/anime/" + series
begin
doc = open(url)
rescue OpenURI::HTTPError => http_error
# bad status code returned
# do something here
status = http_error.io.status[0].to_i # => 3xx, 4xx, or 5xx
puts "Got a bad status code #{status}"
# http_error.message is the numeric code and text in a string
end
data = Nokogiri::HTML(doc)
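Applied to the rake task above, a sketch of the loop that collects failing URLs and skips to the next series might look like this. The error_urls name is just an illustration of the array the question describes, and only OpenURI::HTTPError is rescued here, the same condition handled above:
task :scrape => :environment do
  require 'nokogiri'
  require 'open-uri'
  error_urls = []
  File.read("text.txt").each_line do |line|
    series = line.chop
    url = "http://www.anime-planet.com/anime/" + series
    begin
      data = Nokogiri::HTML(open(url))
    rescue OpenURI::HTTPError => http_error
      error_urls << url   # note the bad URL...
      next                # ...and skip straight to the next series
    end
    # ... same parsing and Series.create calls as in the original task ...
  end
  puts "URLs that did not work:"
  puts error_urls
end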
I'm using Ruby's mysql2 gem found here:
https://github.com/brianmario/mysql2
I have the following code:
client = Mysql2::Client.new(
:host => dbhost,
:port => dbport, :database => dbname,
:username => dbuser,
:password => dbpass)
sql = "SELECT column1, column2, column3 FROM table WHERE id=#{id}"
res = client.query(sql, :as => :array)
p res # prints #<Mysql2::Result:0x007fa8e514b7d0>
Is it possible for the above .query call to return an array of hashes, with each hash in the res array in the format column => value? I can do this manually, but from the docs I was left with the impression that I can get the results loaded directly into memory in the mentioned format. I need this because afterwards I have to encode the result as JSON anyway, so there is no advantage for me in fetching the rows one by one. Also, the amount of data is always very small.
Change
res = client.query(sql, :as => :array)
to:
res = client.query(sql, :as => :hash)
As @Tadman says, :as => :hash is the default, so actually you don't have to specify anything.
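With hash rows you can then read each value by column name; a quick sketch using the column names from the question's SELECT:
res = client.query(sql, :as => :hash)
res.each do |row|
  # keys are the column names as strings by default
  puts row["column1"]
end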
You can always fetch the results as JSON directly:
res = client.query(sql, :as => :json)
The default format, as far as I know, is an array of hashes. If you want symbol keys you need to ask for those. A lot of this is documented in the gem itself.
You should also be extremely cautious about inserting things into your query with string substitution. Whenever possible, use placeholders. These aren't supported by the mysql2 driver directly, so you should use an adapter layer like ActiveRecord or Sequel.
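To illustrate the placeholder point, here is a rough sketch using Sequel instead of interpolating id into the SQL string. The connection options and table name simply reuse the placeholders from the question and are not a working configuration:
require "sequel"
DB = Sequel.connect(:adapter => "mysql2", :host => dbhost, :port => dbport,
                    :database => dbname, :user => dbuser, :password => dbpass)
# Sequel builds the WHERE clause safely rather than relying on string interpolation
rows = DB[:table].where(:id => id).select(:column1, :column2, :column3).all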
The source code for mysql2 implements Mysql2::Result to simply include Enumerable, so the obvious way to access the data is with any method provided by Enumerable (docs here).
For example, #each, #each_with_index, #collect and #to_a are all useful ways to access the Result's elements.
puts res.collect{ |row| "Then the next result was #{row}" }.join("\t")
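Since the question mentions encoding the result as JSON afterwards, a minimal sketch of that step, assuming hash rows and the standard json library:
require "json"
# Mysql2::Result includes Enumerable, so to_a yields an array of row hashes
# that the json library can serialize directly
json = res.to_a.to_json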
Is anyone aware of any tutorials that demonstrate how to import data in a Ruby app with FasterCSV and saving it to a SQLite or MySQL database?
Here are the specific steps involved:
Reading a file line by line (the .foreach method does this according to documentation)
Mapping header names in file to database column names
Creating entries in database for CSV data (seems doable with .new and .save within a .foreach block)
This is a basic usage scenario but I haven't been able to find any tutorials for it, so any resources would be helpful.
Thanks!
It looks like FasterCSV is part of the Ruby core as of Ruby 1.9, so this is what I ended up doing to achieve the goals in my question above:
@importedfile = Import.find(params[:id])
filename = @importedfile.csv.path
CSV.foreach(filename, {:headers => true}) do |row|
@post = Post.find_or_create_by_email(
:content => row[0],
:name => row[1],
:blog_url => row[2],
:email => row[3]
)
end
flash[:notice] = "New posts were successfully processed."
redirect_to posts_path
Inside the find_or_create_by_email function is the mapping from the database columns to the columns of the CSV file: row[0], row[1], row[2], row[3].
Since it is a find_or_create function, I don't need to explicitly call @post.save to save the entry to the database.
If there's a better way please update or add your own answer.
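One small refinement: because :headers => true is already passed, CSV yields each row as a CSV::Row that can be indexed by header name instead of by position, which makes the mapping self-documenting. A sketch that assumes the file's headers are literally content, name, blog_url and email:
CSV.foreach(filename, :headers => true) do |row|
  @post = Post.find_or_create_by_email(
    :content => row["content"],
    :name => row["name"],
    :blog_url => row["blog_url"],
    :email => row["email"]
  )
end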
First, start with other Stack Overflow answers: Best way to read CSV in Ruby. FasterCSV?
Before jumping into writing the code, I check whether there is an existing tool to do the import. You might want to look at mysqlimport.
This is a simple example showing how to map the CSV headers to a database's columns:
require "csv"
data = <<EOT
header1, header2, header 3
1, 2, 3
2, 2, 3
3, 2, 3
EOT
header_to_table_columns = {
'header1' => 'col1',
'header2' => 'col2',
'header 3' => 'col3'
}
arr_of_arrs = CSV.parse(data)
headers = arr_of_arrs.shift.map{ |i| i.strip }
db_cols = header_to_table_columns.values_at(*headers)
arr_of_arrs.each do |ary|
# insert into the database using an ORM or by creating insert statements
end
Ruby is great for rolling your own import routines.
Reading a file (a handy block structure ensures that the file handle is closed properly):
File.open( filepath ) do |f|
f.each_line do |line|
# do something with the line...
end
end
Mapping header names to columns (you might want to check for matching array lengths):
Hash[header_array.zip( line_array )]
Creating entries in the database using activerecord:
SomeModel.create( Hash[header_array.zip( line_array )] )
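Tying those three pieces together, a rough sketch under the assumption that the first line of the file holds the headers, that SomeModel's column names match them, and that fields contain no quoted commas (a real import would use the csv library for that):
File.open(filepath) do |f|
  header_array = f.readline.chomp.split(",").map(&:strip)
  f.each_line do |line|
    line_array = line.chomp.split(",").map(&:strip)
    # build a column => value hash and hand it straight to the model
    SomeModel.create(Hash[header_array.zip(line_array)])
  end
end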
It sounds like you are planning to let users upload CSV files and import them into the database. This is asking for trouble unless they are savvy about data. You might want to look into a NoSQL solution to simplify things on the import front.
This seems to be the shortest way, if you can use the ID to identify the records and if no mapping of column names is necessary:
CSV.foreach(filename, {:headers => true}) do |row|
post = Post.find_or_create_by_id row["id"]
post.update_attributes row.to_hash
end