Module: Crawlers::Helpers::Content
- Included in: Rss
- Defined in: lib/crawlers/helpers/content.rb
Instance Method Summary
Instance Method Details
#extract_primary_content(html_text) ⇒ Object
    # File 'lib/crawlers/helpers/content.rb', line 7
    def extract_primary_content(html_text)
      content = Readability::Document.new(html_text).content
      sanitized_content = Sanitize.clean(content)
      remove_trailing_spaces(sanitized_content)
    end
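The final step of the method calls `remove_trailing_spaces`, which is not documented on this page. As a rough illustration only, the sketch below assumes the helper strips trailing whitespace from each line and from the end of the string; the body is hypothetical and the real implementation may behave differently.

```ruby
# Hypothetical stand-in for the helper called at the end of
# extract_primary_content. Its real definition is not shown in this
# documentation, so the behavior here is an assumption.
def remove_trailing_spaces(text)
  text
    .lines                # split into lines, keeping each line's terminator
    .map(&:rstrip)        # drop trailing spaces, tabs, and the newline itself
    .join("\n")           # rejoin the cleaned lines
    .rstrip               # drop blank lines left at the very end
end
```

With this sketch, a call such as `remove_trailing_spaces("Title   \nBody text \t\n\n")` would leave interior line breaks in place and remove only the whitespace at the end of each line and at the end of the text.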