Is there a way to do 'hot code reloading' with a Rails application in the development environment?
For example: I'm working on a Rails application, I add a few lines of css in a stylesheet, I look at the browser to see the modified styling. As of right now I have to refresh the page with cmd-r or by clicking the refresh button.
Is there a way to get the page to reload automatically when changes are made?
This works nicely in the Phoenix web framework (and I'm sure Phoenix isn't the only framework in this feature). How could a feature like this be enabled in Ruby on Rails?
I am using this setup reloads all assets, js, css, ruby files
in Gemfile
group :development, :test do
gem 'guard-livereload', '~> 2.5', require: false
end
group :development do
gem 'listen'
gem 'guard'
gem 'guard-zeus'
gem 'rack-livereload'
end
insert this in your development.rb
config.middleware.insert_after ActionDispatch::Static, Rack::LiveReload
i have this in my guard file
# A sample Guardfile
# More info at https://github.com/guard/guard#readme
## Uncomment and set this to only include directories you want to watch
# directories %w(app lib config test spec features) \
# .select{|d| Dir.exists?(d) ? d : UI.warning("Directory #{d} does not exist")}
## Note: if you are using the `directories` clause above and you are not
## watching the project directory ('.'), then you will want to move
## the Guardfile to a watched dir and symlink it back, e.g.
#
# $ mkdir config
# $ mv Guardfile config/
# $ ln -s config/Guardfile .
#
# and, you'll have to watch "config/Guardfile" instead of "Guardfile"
guard 'livereload' do
extensions = {
css: :css,
scss: :css,
sass: :css,
js: :js,
coffee: :js,
html: :html,
png: :png,
gif: :gif,
jpg: :jpg,
jpeg: :jpeg,
# less: :less, # uncomment if you want LESS stylesheets done in browser
}
rails_view_exts = %w(erb haml slim)
# file types LiveReload may optimize refresh for
compiled_exts = extensions.values.uniq
watch(%r{public/.+\.(#{compiled_exts * '|'})})
extensions.each do |ext, type|
watch(%r{
(?:app|vendor)
(?:/assets/\w+/(?<path>[^.]+) # path+base without extension
(?<ext>\.#{ext})) # matching extension (must be first encountered)
(?:\.\w+|$) # other extensions
}x) do |m|
path = m[1]
"/assets/#{path}.#{type}"
end
end
# file needing a full reload of the page anyway
watch(%r{app/views/.+\.(#{rails_view_exts * '|'})$})
watch(%r{app/helpers/.+\.rb})
watch(%r{config/locales/.+\.yml})
end
guard 'zeus' do
require 'ostruct'
rspec = OpenStruct.new
# rspec.spec_dir = 'spec'
# rspec.spec = ->(m) { "#{rspec.spec_dir}/#{m}_spec.rb" }
# rspec.spec_helper = "#{rspec.spec_dir}/spec_helper.rb"
# matchers
# rspec.spec_files = /^#{rspec.spec_dir}\/.+_spec\.rb$/
# Ruby apps
ruby = OpenStruct.new
ruby.lib_files = /^(lib\/.+)\.rb$/
# watch(rspec.spec_files)
# watch(rspec.spec_helper) { rspec.spec_dir }
# watch(ruby.lib_files) { |m| rspec.spec.call(m[1]) }
# Rails example
rails = OpenStruct.new
rails.app_files = /^app\/(.+)\.rb$/
rails.views_n_layouts = /^app\/(.+(?:\.erb|\.haml|\.slim))$/
rails.controllers = %r{^app/controllers/(.+)_controller\.rb$}
# watch(rails.app_files) { |m| rspec.spec.call(m[1]) }
# watch(rails.views_n_layouts) { |m| rspec.spec.call(m[1]) }
# watch(rails.controllers) do |m|
# [
# rspec.spec.call("routing/#{m[1]}_routing"),
# rspec.spec.call("controllers/#{m[1]}_controller"),
# rspec.spec.call("acceptance/#{m[1]}")
# ]
# end
end
I am using zeus instead of spring on this setup.
Run guard
Open localhost:3000 and you are good to go.
This should resolve your question, and have blazing reload times better than browserify.
I commented out guard looking at test directories if you want you can uncomment those lines if your are doing TDD.
CSS hot swapping and auto-reload when HTML/JS changes can be achieved with guard in combination with livereload: https://github.com/guard/guard-livereload
This gem would auto reload when you make changes to js elements(Not css or ruby files).
https://github.com/rmosolgo/react-rails-hot-loader
Never seen css hot code reloading in rails platform.
I am new to RoR and I am following the Agile Web Development with Rails 4 book and I am to the point where we add CSS styling to the store front page (page 135 "8.1 Iteration C1: Creating the Catalog Listing").
However, I did exactly as the book stated but the CSS style is not being loaded when I load the page.
I did some research and from what I understand, everything is correct.
Does anyone see anything wrong or what I need to add?
I have tried everything here but nothing worked: Agile web development with rails” book: CSS not applied
UPDATE Looks like the application is using /assets/scaffold.css.scss for styling. If I go to this file a change colors, they reflect on the StoreControllers view!!?? What is wrong here??!!
I cannot figure out why the application is using the incorrect stylesheet...
BTW I also ran RAILS_ENV=production bundle exec rake assets:precompile and made sure the assets were included in public/assets/ folder
Here is my code:
Depot/app/assets/stylesheets/store.css.scss
.store {
h1 {
margin:0;
padding-bottom: 0.5em;
font: 150% san-serif;
color: #226;
border-bottom: 3px dotted #77d;
}
.entry {
overflow: auto;
margin-top: 1em;
border-bottom: 1px dotted #77d;
min-hieght:100px;
img {
width: 80px;
margin-right: 5px;
margin-bottom: 5px;
position: absolute;
}
h3 {
font-size: 120%;
font-family: sans-serif;
margin-left: 100px;
margin-top: 0;
margin-bottom: 2px;
color: #227;
}
p,div.price_line{
margin-left: 100px;
margin-top: 0.5em;
margin-bottom: 0.8em;
}
.price{
color: #44a;
font-weight: bold;
margin-right: 3em;
}
}
}
// Place all the styles related to the Store controller here.
// They will automatically be included in application.css.
// You can use Sass (SCSS) here: http://sass-lang.com/
Depot/app/views/layouts/application.html.erb
<!DOCTYPE html>
<html>
<head>
<title>Application</title>
<%= stylesheet_link_tag "application", media: "all", "data-turbolinks-track" => true %>
<%= javascript_include_tag "application", "data-turbolinks-track" => true %>
<%= csrf_meta_tags %>
</head>
<body class='<%= controller.controller_name%>'>
<%= yield %>
</body>
</html>
Depot/app/controllers/store/store_controller.rb
class StoreController < ApplicationController
def index
#products = Product.order(:title)
end
end
Depot/app/views/store/index.html.erb
<% if notice %>
<p id= "notice" ><%= notice %></p>
<% end %>
<h1>Your Pragmatic Catalog</h1>
<% #products.each do |product| %>
<div class= ".entry" >
<%= image_tag(product.image_url)%>
<h3><%= product.title %></h3>
<%= sanitize(product.description)%>
<div class= "price_line" >
<span class= "price" ><%= product.price %></span>
</div>
</div>
<% end %>
Depot/app/assets/stylesheets/application.css
/*
* This is a manifest file that'll be compiled into application.css, which will include all the files
* listed below.
*
* Any CSS and SCSS file within this directory, lib/assets/stylesheets, vendor/assets/stylesheets,
* or vendor/assets/stylesheets of plugins, if any, can be referenced here using a relative path.
*
* You're free to add application-wide styles to this file and they'll appear at the top of the
* compiled file, but it's generally better to create a new file per style scope.
*
*= require_self
*= require_tree .
*/
Depot/config/environments/production.rb
Depot::Application.configure do
# Settings specified here will take precedence over those in config/application.rb.
# Code is not reloaded between requests.
config.cache_classes = true
# Eager load code on boot. This eager loads most of Rails and
# your application in memory, allowing both thread web servers
# and those relying on copy on write to perform better.
# Rake tasks automatically ignore this option for performance.
config.eager_load = true
# Full error reports are disabled and caching is turned on.
config.consider_all_requests_local = false
config.action_controller.perform_caching = true
# Enable Rack::Cache to put a simple HTTP cache in front of your application
# Add `rack-cache` to your Gemfile before enabling this.
# For large-scale production use, consider using a caching reverse proxy like nginx, varnish or squid.
# config.action_dispatch.rack_cache = true
# Disable Rails's static asset server (Apache or nginx will already do this).
config.serve_static_assets = true
# Compress JavaScripts and CSS.
config.assets.js_compressor = :uglifier
# config.assets.css_compressor = :sass
# Do not fallback to assets pipeline if a precompiled asset is missed.
config.assets.compile = true
config.assets.precompile = ['*.js', '*.css', '*.css.erb']
# Generate digests for assets URLs.
config.assets.digest = true
# Version of your assets, change this if you want to expire all your assets.
config.assets.version = '1.0'
# Specifies the header that your server uses for sending files.
# config.action_dispatch.x_sendfile_header = "X-Sendfile" # for apache
# config.action_dispatch.x_sendfile_header = 'X-Accel-Redirect' # for nginx
# Force all access to the app over SSL, use Strict-Transport-Security, and use secure cookies.
# config.force_ssl = true
# Set to :debug to see everything in the log.
config.log_level = :info
# Prepend all log lines with the following tags.
# config.log_tags = [ :subdomain, :uuid ]
# Use a different logger for distributed setups.
# config.logger = ActiveSupport::TaggedLogging.new(SyslogLogger.new)
# Use a different cache store in production.
# config.cache_store = :mem_cache_store
# Enable serving of images, stylesheets, and JavaScripts from an asset server.
# config.action_controller.asset_host = "http://assets.example.com"
# Precompile additional assets.
# application.js, application.css, and all non-JS/CSS in app/assets folder are already added.
# config.assets.precompile += %w( search.js )
# Ignore bad email addresses and do not raise email delivery errors.
# Set this to true and configure the email server for immediate delivery to raise delivery errors.
# config.action_mailer.raise_delivery_errors = false
# Enable locale fallbacks for I18n (makes lookups for any locale fall back to
# the I18n.default_locale when a translation can not be found).
config.i18n.fallbacks = true
# Send deprecation notices to registered listeners.
config.active_support.deprecation = :notify
# Disable automatic flushing of the log to improve performance.
# config.autoflush_log = false
# Use default logging formatter so that PID and timestamp are not suppressed.
config.log_formatter = ::Logger::Formatter.new
end
Depot::Application.configure do
# Settings specified here will take precedence over those in config/application.rb.
# Code is not reloaded between requests.
config.cache_classes = true
# Eager load code on boot. This eager loads most of Rails and
# your application in memory, allowing both thread web servers
# and those relying on copy on write to perform better.
# Rake tasks automatically ignore this option for performance.
config.eager_load = true
# Full error reports are disabled and caching is turned on.
config.consider_all_requests_local = false
config.action_controller.perform_caching = true
# Enable Rack::Cache to put a simple HTTP cache in front of your application
# Add `rack-cache` to your Gemfile before enabling this.
# For large-scale production use, consider using a caching reverse proxy like nginx, varnish or squid.
# config.action_dispatch.rack_cache = true
# Disable Rails's static asset server (Apache or nginx will already do this).
config.serve_static_assets = true
# Compress JavaScripts and CSS.
config.assets.js_compressor = :uglifier
# config.assets.css_compressor = :sass
# Do not fallback to assets pipeline if a precompiled asset is missed.
config.assets.compile = true
config.assets.precompile = ['*.js', '*.css', '*.css.erb']
# Generate digests for assets URLs.
config.assets.digest = true
# Version of your assets, change this if you want to expire all your assets.
config.assets.version = '1.0'
# Specifies the header that your server uses for sending files.
# config.action_dispatch.x_sendfile_header = "X-Sendfile" # for apache
# config.action_dispatch.x_sendfile_header = 'X-Accel-Redirect' # for nginx
# Force all access to the app over SSL, use Strict-Transport-Security, and use secure cookies.
# config.force_ssl = true
# Set to :debug to see everything in the log.
config.log_level = :info
# Prepend all log lines with the following tags.
# config.log_tags = [ :subdomain, :uuid ]
# Use a different logger for distributed setups.
# config.logger = ActiveSupport::TaggedLogging.new(SyslogLogger.new)
# Use a different cache store in production.
# config.cache_store = :mem_cache_store
# Enable serving of images, stylesheets, and JavaScripts from an asset server.
# config.action_controller.asset_host = "http://assets.example.com"
# Precompile additional assets.
# application.js, application.css, and all non-JS/CSS in app/assets folder are already added.
# config.assets.precompile += %w( search.js )
# Ignore bad email addresses and do not raise email delivery errors.
# Set this to true and configure the email server for immediate delivery to raise delivery errors.
# config.action_mailer.raise_delivery_errors = false
# Enable locale fallbacks for I18n (makes lookups for any locale fall back to
# the I18n.default_locale when a translation can not be found).
config.i18n.fallbacks = true
# Send deprecation notices to registered listeners.
config.active_support.deprecation = :notify
# Disable automatic flushing of the log to improve performance.
# config.autoflush_log = false
# Use default logging formatter so that PID and timestamp are not suppressed.
config.log_formatter = ::Logger::Formatter.new
end
This solved it for me.
CSS is not loading in app
Be sure to test your CSS with something more obvious, like
color: red;
somewhere. The agile book is only aligning the text here. It could be easily missed.
I am parsing the links on wikipedia pages of actors, and trying to find links to films they appeared in.
I have a basic method that searchs the links and checks for the word film in the link. However many of the links to films do not actually contain this word.
However, within the paragraphs that the links are contained in, the word film appears , for example:
<p>Dreyfuss's first film part was a small, uncredited role in
<i><a href="/wiki/The_Graduate" title="The Graduate">The Graduate
// Paragraph goes on for a long time.
Here is the block from the method that checks all the links:
all_links = doca.search('//a[#href]')
all_links.each do |link|
link_info = link['href']
if link_info.include?("(film)") && !(link_info.include?("Category:") || link_info.include?("php"))
then out << link_info end
end
out.uniq.collect {|link| strip_out_name(link)}
Would there be a way of checking the previous text before the link but after the <p> tag for the word film, but being careful not to check other links (and also perhaps limited the search to 50 characters before the link)?
Thanks for any help or suggestions.
Click here, this is the main page that I am testing on
It is possible to search for text inside a tag. See https://stackoverflow.com/a/19816840/128421 for an example.
But, I'd do it something similar to this way:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(open('http://en.wikipedia.org/wiki/Richard_Dreyfuss'))
table = doc.at('#Filmography').parent.next_element
films = table.search('tr')[1..-1].map{ |tr|
tds = tr.search('td')
year = tds.shift.text
movie = tds.shift
movie_url = movie.at('a')['href']
movie_title = movie.at('a').text
role = tds.shift.text
{
year: year,
movie_url: movie_url,
movie_title: movie_title,
role: role
}
}
films
# => [{:year=>"1966",
# :movie_url=>"/wiki/Bewitched",
# :movie_title=>"Bewitched",
# :role=>"Rodney"},
# {:year=>"1966",
# :movie_url=>"/wiki/Gidget_(TV_series)",
# :movie_title=>"Gidget",
# :role=>"Durf the Drag"},
# {:year=>"1967",
# :movie_url=>"/wiki/Valley_of_the_Dolls_(film)",
# :movie_title=>"Valley of the Dolls",
# :role=>"Assistant stage manager"},
# {:year=>"1967",
# :movie_url=>"/wiki/The_Graduate",
# :movie_title=>"The Graduate",
# :role=>"Boarding House Resident"},
# {:year=>"1967",
# :movie_url=>"/wiki/The_Big_Valley",
# :movie_title=>"The Big Valley",
# :role=>"Lud Akley"},
# {:year=>"1968",
# :movie_url=>"/wiki/The_Young_Runaways",
# :movie_title=>"The Young Runaways",
# :role=>"Terry"},
# {:year=>"1969",
# :movie_url=>"/wiki/Hello_Down_There",
# :movie_title=>"Hello Down There",
# :role=>"Harold Webster"},
# {:year=>"1970",
# :movie_url=>"/wiki/The_Mod_Squad",
# :movie_title=>"The Mod Squad",
# :role=>"Curtis Bell"},
# {:year=>"1973",
# :movie_url=>"/wiki/American_Graffiti",
# :movie_title=>"American Graffiti",
# :role=>"Curt Henderson"},
# {:year=>"1973",
# :movie_url=>"/wiki/Dillinger_(1973_film)",
# :movie_title=>"Dillinger",
# :role=>"Baby Face Nelson"},
# {:year=>"1974",
# :movie_url=>"/wiki/The_Apprenticeship_of_Duddy_Kravitz_(film)",
# :movie_title=>"The Apprenticeship of Duddy Kravitz",
# :role=>"Duddy"},
# {:year=>"1974",
# :movie_url=>"/wiki/The_Second_Coming_of_Suzanne",
# :movie_title=>"The Second Coming of Suzanne",
# :role=>"Clavius"},
# {:year=>"1975",
# :movie_url=>"/wiki/Inserts_(film)",
# :movie_title=>"Inserts",
# :role=>"The Boy Wonder"},
# {:year=>"1975",
# :movie_url=>"/wiki/Jaws_(film)",
# :movie_title=>"Jaws",
# :role=>"Matt Hooper"},
# {:year=>"1976",
# :movie_url=>"/wiki/Victory_at_Entebbe",
# :movie_title=>"Victory at Entebbe",
# :role=>"Colonel Yonatan 'Yonni' Netanyahu"},
# {:year=>"1977",
# :movie_url=>"/wiki/Close_Encounters_of_the_Third_Kind",
# :movie_title=>"Close Encounters of the Third Kind",
# :role=>"Roy Neary"},
# {:year=>"1977",
# :movie_url=>"/wiki/The_Goodbye_Girl",
# :movie_title=>"The Goodbye Girl",
# :role=>"Elliott Garfield"},
# {:year=>"1978",
# :movie_url=>"/wiki/The_Big_Fix",
# :movie_title=>"The Big Fix",
# :role=>"Moses Wine"},
# {:year=>"1980",
# :movie_url=>"/wiki/The_Competition_(film)",
# :movie_title=>"The Competition",
# :role=>"Paul Dietrich"},
# {:year=>"1981",
# :movie_url=>"/wiki/Whose_Life_Is_It_Anyway%3F_(1981_film)",
# :movie_title=>"Whose Life Is It Anyway?",
# :role=>"Ken Harrison"},
# {:year=>"1984",
# :movie_url=>"/wiki/The_Buddy_System_(film)",
# :movie_title=>"The Buddy System",
# :role=>"Joe"},
# {:year=>"1986",
# :movie_url=>"/wiki/Down_and_Out_in_Beverly_Hills",
# :movie_title=>"Down and Out in Beverly Hills",
# :role=>"David 'Dave' Whiteman"},
# {:year=>"1986",
# :movie_url=>"/wiki/Stand_by_Me_(film)",
# :movie_title=>"Stand by Me",
# :role=>"Narrator/Gordie LaChance (adult)"},
# {:year=>"1987",
# :movie_url=>"/wiki/Tin_Men",
# :movie_title=>"Tin Men",
# :role=>"Bill 'BB' Babowsky"},
# {:year=>"1987",
# :movie_url=>"/wiki/Stakeout_(1987_film)",
# :movie_title=>"Stakeout",
# :role=>"Det. Chris Lecce"},
# {:year=>"1987",
# :movie_url=>"/wiki/Nuts_(film)",
# :movie_title=>"Nuts",
# :role=>"Aaron Levinsky"},
# {:year=>"1988",
# :movie_url=>"/wiki/Moon_Over_Parador",
# :movie_title=>"Moon Over Parador",
# :role=>"Jack Noah/President Alphonse Simms"},
# {:year=>"1989",
# :movie_url=>"/wiki/Let_It_Ride_(film)",
# :movie_title=>"Let It Ride",
# :role=>"Jay Trotter"},
# {:year=>"1989",
# :movie_url=>"/wiki/Always_(1989_film)",
# :movie_title=>"Always",
# :role=>"Pete Sandich"},
# {:year=>"1990",
# :movie_url=>"/wiki/Rosencrantz_%26_Guildenstern_Are_Dead_(film)",
# :movie_title=>"Rosencrantz & Guildenstern Are Dead",
# :role=>"The Player"},
# {:year=>"1990",
# :movie_url=>"/wiki/Postcards_from_the_Edge_(film)",
# :movie_title=>"Postcards from the Edge",
# :role=>"Doctor Frankenthal"},
# {:year=>"1991",
# :movie_url=>"/wiki/Once_Around",
# :movie_title=>"Once Around",
# :role=>"Sam Sharpe"},
# {:year=>"1991",
# :movie_url=>"/wiki/Prisoner_of_Honor",
# :movie_title=>"Prisoner of Honor",
# :role=>"Col. Picquart"},
# {:year=>"1991",
# :movie_url=>"/wiki/What_About_Bob%3F",
# :movie_title=>"What About Bob?",
# :role=>"Dr. Leo Marvin"},
# {:year=>"1993",
# :movie_url=>"/wiki/Lost_in_Yonkers_(film)",
# :movie_title=>"Lost in Yonkers",
# :role=>"Louie Kurnitz"},
# {:year=>"1993",
# :movie_url=>"/wiki/Another_Stakeout",
# :movie_title=>"Another Stakeout",
# :role=>"Detective Chris Lecce"},
# {:year=>"1994",
# :movie_url=>"/wiki/Silent_Fall",
# :movie_title=>"Silent Fall",
# :role=>"Dr. Jake Rainer"},
# {:year=>"1995",
# :movie_url=>
# "/w/index.php?title=The_Last_Word_(1995_film)&action=edit&redlink=1",
# :movie_title=>"The Last Word",
# :role=>"Larry"},
# {:year=>"1995",
# :movie_url=>"/wiki/The_American_President_(film)",
# :movie_title=>"The American President",
# :role=>"Senator Bob Rumson"},
# {:year=>"1995",
# :movie_url=>"/wiki/Mr._Holland%27s_Opus",
# :movie_title=>"Mr. Holland's Opus",
# :role=>"Glenn Holland"},
# {:year=>"1996",
# :movie_url=>"/wiki/James_and_the_Giant_Peach_(film)",
# :movie_title=>"James and the Giant Peach",
# :role=>"Centipede (voice)"},
# {:year=>"1996",
# :movie_url=>"/wiki/Mad_Dog_Time",
# :movie_title=>"Mad Dog Time",
# :role=>"Vic"},
# {:year=>"1997",
# :movie_url=>"/wiki/Night_Falls_on_Manhattan",
# :movie_title=>"Night Falls on Manhattan",
# :role=>"Sam Vigoda"},
# {:year=>"1997",
# :movie_url=>"/wiki/Oliver_Twist_(1997_film)",
# :movie_title=>"Oliver Twist",
# :role=>"Fagin"},
# {:year=>"1998",
# :movie_url=>"/wiki/Krippendorf%27s_Tribe",
# :movie_title=>"Krippendorf's Tribe",
# :role=>"Prof. James Krippendorf"},
# {:year=>"1999",
# :movie_url=>"/wiki/Lansky_(film)",
# :movie_title=>"Lansky",
# :role=>"Meyer Lansky"},
# {:year=>"2000",
# :movie_url=>"/wiki/The_Crew_(2000_film)",
# :movie_title=>"The Crew",
# :role=>"Bobby Bartellemeo/Narrator"},
# {:year=>"2000",
# :movie_url=>"/wiki/Fail_Safe_(2000_TV)",
# :movie_title=>"Fail Safe",
# :role=>"President of the United States"},
# {:year=>"2001",
# :movie_url=>"/wiki/The_Old_Man_Who_Read_Love_Stories",
# :movie_title=>"The Old Man Who Read Love Stories",
# :role=>"Antonio Bolivar"},
# {:year=>"2001",
# :movie_url=>"/wiki/Who_Is_Cletis_Tout%3F",
# :movie_title=>"Who Is Cletis Tout?",
# :role=>"Micah Donnelly"},
# {:year=>"2001",
# :movie_url=>"/wiki/The_Education_of_Max_Bickford",
# :movie_title=>"The Education of Max Bickford",
# :role=>"Max Bickford"},
# {:year=>"2001",
# :movie_url=>"/wiki/The_Day_Reagan_Was_Shot",
# :movie_title=>"The Day Reagan Was Shot",
# :role=>"Alexander Haig"},
# {:year=>"2003",
# :movie_url=>"/wiki/Coast_to_Coast_(TV_film)",
# :movie_title=>"Coast to Coast",
# :role=>"Barnaby Pierce"},
# {:year=>"2004",
# :movie_url=>"/wiki/Silver_City_(2004_film)",
# :movie_title=>"Silver City",
# :role=>"Chuck Raven"},
# {:year=>"2006",
# :movie_url=>"/wiki/Poseidon_(film)",
# :movie_title=>"Poseidon",
# :role=>"Richard Nelson"},
# {:year=>"2007",
# :movie_url=>"/wiki/Tin_Man_(TV_miniseries)",
# :movie_title=>"Tin Man",
# :role=>"Mystic Man"},
# {:year=>"2007",
# :movie_url=>"/wiki/Ocean_of_Fear",
# :movie_title=>"Ocean of Fear",
# :role=>"Narrator"},
# {:year=>"2008",
# :movie_url=>"/wiki/Signs_of_the_Time_(film)",
# :movie_title=>"Signs of the Time",
# :role=>"Narrator"},
# {:year=>"2008",
# :movie_url=>"/wiki/W._(film)",
# :movie_title=>"W.",
# :role=>"Dick Cheney"},
# {:year=>"2008",
# :movie_url=>"/w/index.php?title=America_Betrayed&action=edit&redlink=1",
# :movie_title=>"America Betrayed",
# :role=>"Narrator"},
# {:year=>"2009",
# :movie_url=>"/wiki/My_Life_in_Ruins",
# :movie_title=>"My Life in Ruins",
# :role=>"Irv"},
# {:year=>"2009",
# :movie_url=>"/wiki/Leaves_of_Grass_(film)",
# :movie_title=>"Leaves of Grass",
# :role=>"Pug Rothbaum"},
# {:year=>"2009",
# :movie_url=>"/wiki/The_Lightkeepers",
# :movie_title=>"The Lightkeepers",
# :role=>"Seth"},
# {:year=>"2010",
# :movie_url=>"/wiki/Piranha_3D",
# :movie_title=>"Piranha 3D",
# :role=>"Matthew Boyd"},
# {:year=>"2010",
# :movie_url=>"/wiki/Weeds_(TV_series)",
# :movie_title=>"Weeds",
# :role=>"Warren Schiff"},
# {:year=>"2010",
# :movie_url=>"/wiki/RED_(film)",
# :movie_title=>"RED",
# :role=>"Alexander Dunning"},
# {:year=>"2012",
# :movie_url=>"/wiki/Coma_(U.S._miniseries)",
# :movie_title=>"Coma",
# :role=>"Professor Hillside"},
# {:year=>"2013",
# :movie_url=>"/wiki/Very_Good_Girls",
# :movie_title=>"Very Good Girls",
# :role=>"Danny, Gerry's father"},
# {:year=>"2013",
# :movie_url=>"/wiki/Paranoia_(2013_film)",
# :movie_title=>"Paranoia",
# :role=>"Francis Cassidy"}]
To explain what it's doing:
The "Filmology" table is a good source for the information; It's organized logically, so writing code to walk through it is easy.
doc.at('#Filmography').parent.next_element
finds that table using the <h2> heading just above it, then backs up and looks in the next tag, which is the table itself.
table.search('tr')[1..-1] finds the <tr> rows inside the table, skips the first, then iterates (using map) over the remaining ones.
tds = tr.search('td') finds the cells for the table. From that point on it's a matter of peeling that NodeSet apart like an array, by looking at the elements I want. The rest of the code should be pretty obvious. Once the individual parts are retrieved that are of interest they're bundled into a hash, which is returned as part of an array of hashes by map.
Why not try parsing out the filmography section of the wikipedia article? It seems pretty standard across the few actors that I looked at, and it mentions whether or not it was a TV series so you could filter those out easily.
<tr>
<td>1966</td>
<td><i>Gidget</i></td>
<td>Durf the Drag</td>
<td>TV series 1 episode</td>
</tr>
<tr>
<td>1967</td>
<td><i>Valley of the Dolls</i></td>
<td>Assistant stage manager</td>
<td>Uncredited</td>
</tr>
Looks like you could pull nodes similar to this from the code and save all the info to do what you want with it. The first node could be disregarded since "TV" appears multiple times in the different subnodes.
Hope this helps!
-Larry
Okay So I have tested the code based on your actual request and come up with the following
url = "http://en.wikipedia.org/wiki/Richard_Dreyfuss"
doc = Nokogiri::HTML(open(url))
all_links = doc.search("//a[#href]")
all_links.each do |link|
p_text = link.ancestors("p").text
link_index = p_text.index(link.text)
unless link_index.nil?
search_back = link_index > 50 ? link_index - 50 : 0
p_text[search_back..link_index].downcase.include?("film") ? puts(link['href']) : nil
end
end
Output
#=>/wiki/American_Graffiti
/wiki/Jaws_(film)
/wiki/Close_Encounters_of_the_Third_Kind
/wiki/The_Graduate
/wiki/The_Apprenticeship_of_Duddy_Kravitz_(film)
/wiki/Down_And_Out_In_Beverly_Hills
/wiki/Stakeout_(1987_film)
/wiki/Stephen_King
/wiki/The_Body_(novella)
/wiki/Poseidon_(film)
#cite_note-27
/wiki/Jonathan_Tasini
This seems to satisfy the question you were asking but obviously needs to be modified to fit your needs.
Edit
Added your request for running back on 50 characters in the paragraph the response is much shorter now but I am not sure that the results will be as useful as you'd like. This answers the question but does not capture exactly what you are hoping for e.g. the last 2 links are not to films but they are within 50 characters of the world film.