Module: Scrapers::GoComics
- Defined in: lib/scrapers/gocomics.rb
Constant Summary
- GOCOMIC_URL = "http://www.gocomics.com/"
Class Method Summary
- .scrape(comic) ⇒ Object
- .scrape_image_source(page) ⇒ Object
- .scrape_pubdate(page) ⇒ Object
- .scrape_title(page) ⇒ Object
Class Method Details
.scrape(comic) ⇒ Object
    # File 'lib/scrapers/gocomics.rb', line 11

    def self.scrape(comic)
      results = Hash.new
      results[:comic] = comic
      url = URI.parse GOCOMIC_URL
      url.path = "/#{comic}"
      results[:url] = url.to_s
      page = Nokogiri::HTML(open(url.to_s))
      results[:title] = scrape_title(page)
      results[:pubdate] = scrape_pubdate(page)
      results[:img_src] = scrape_image_source(page)
      results
    end
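The URL-building step of `.scrape` can be tried without any network access. The following is a minimal sketch, not part of the module: `comic_page_url` is a hypothetical helper name, and the comic slug `calvinandhobbes` is only an illustrative value.

```ruby
require "uri"

# Hypothetical helper (not defined in Scrapers::GoComics): builds the page
# URL the same way .scrape does, by parsing the base URL and replacing its path.
def comic_page_url(comic, base = "http://www.gocomics.com/")
  url = URI.parse(base)
  url.path = "/#{comic}"
  url.to_s
end

puts comic_page_url("calvinandhobbes")
# prints "http://www.gocomics.com/calvinandhobbes"
```

Note that on Ruby 3.0 and later `Kernel#open` no longer opens URLs, so the `open(url.to_s)` call in the listing would need to become `URI.open(url.to_s)` with `require "open-uri"`.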
.scrape_image_source(page) ⇒ Object
    # File 'lib/scrapers/gocomics.rb', line 40

    def self.scrape_image_source(page)
      page.
        at_css("p.feature_item").
        at_css("img").
        attr("src")
    end
.scrape_pubdate(page) ⇒ Object
    # File 'lib/scrapers/gocomics.rb', line 36

    def self.scrape_pubdate(page)
      Date.parse(page.at_css("ul.feature-nav > li").content).to_s
    end
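The date handling in `.scrape_pubdate` amounts to passing the list item's text through `Date.parse` and converting the result to an ISO 8601 string. A small sketch of that step follows, assuming the text looks like `January 5, 2015` (the real page format is not shown here). `normalize_pubdate` is a hypothetical name, and returning `nil` on unparseable input is an addition that the original method does not make.

```ruby
require "date"

# Hypothetical helper (not defined in Scrapers::GoComics): converts free-form
# date text to "YYYY-MM-DD". Returns nil instead of raising when the text
# cannot be parsed; the original method lets the exception propagate.
def normalize_pubdate(text)
  Date.parse(text).to_s
rescue ArgumentError
  nil
end

puts normalize_pubdate("January 5, 2015")
# prints "2015-01-05"
```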
.scrape_title(page) ⇒ Object
    # File 'lib/scrapers/gocomics.rb', line 32

    def self.scrape_title(page)
      page.at_css("title").content.strip.gsub(/[[:space:]]/,' ').squeeze(" ")
    end
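The whitespace cleanup in `.scrape_title` is easier to follow on a plain string. In this sketch, `normalize_title` is a hypothetical helper that applies the same three calls, and the sample title is invented for illustration. `strip` trims the ends, `gsub(/[[:space:]]/, " ")` turns every whitespace character (including tabs, newlines, and non-breaking spaces) into a regular space, and `squeeze(" ")` collapses runs of spaces into one.

```ruby
# Hypothetical helper (not defined in Scrapers::GoComics): applies the same
# strip / gsub / squeeze chain that .scrape_title applies to the <title> text.
def normalize_title(raw)
  raw.strip.gsub(/[[:space:]]/, " ").squeeze(" ")
end

# Invented sample containing a newline, a non-breaking space, and a tab.
raw = "\n  Calvin and\u00A0Hobbes\t by  Bill Watterson \n"
puts normalize_title(raw)
# prints "Calvin and Hobbes by Bill Watterson"
```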