Module: Crawlers::Helpers::Content

Included in:
Rss
Defined in:
lib/crawlers/helpers/content.rb

Instance Method Summary

Instance Method Details

#extract_primary_content(html_text) ⇒ Object



# File 'lib/crawlers/helpers/content.rb', line 7

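def extract_primary_content(html_text)
  # Extract the main article body from the full page HTML.
  content = Readability::Document.new(html_text).content
  # Strip markup using Sanitize's default configuration.
  sanitized_content = Sanitize.clean(content)
  # Helper defined elsewhere; its source is not shown on this page.
  remove_trailing_spaces(sanitized_content)
end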
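
The last step calls remove_trailing_spaces, whose source is not listed on this page. As a rough illustration only, a helper with that name might look like the sketch below. The body is an assumption based on the method name, not the project's actual implementation, and it uses only the Ruby standard library.

```ruby
# Hypothetical stand-in for the remove_trailing_spaces helper.
# The real implementation is not shown in this documentation.
def remove_trailing_spaces(text)
  # Remove spaces and tabs that sit directly before a line break or the
  # end of the string, then drop any trailing blank lines.
  text.gsub(/[ \t]+$/, '').rstrip
end

cleaned = remove_trailing_spaces("Title  \nBody text \t\n\n")
puts cleaned.inspect
```

Under this assumption, interior line breaks are preserved and only the whitespace at line ends and at the end of the text is dropped.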