Class: Wikiwhat::Text

Inherits: Results

Inheritance chain: Object < Results < Wikiwhat::Text

Defined in: lib/wikiwhat/parse.rb

Overview

Extracts portions of text from a Wiki article.

Instance Method Summary

#find_header(header) ⇒ Object

Find all paragraphs under a given heading.

#initialize(api_return, prop = 'extract') ⇒ Text (constructor)

A new instance of Text.

#only_text(string) ⇒ Object

Removes HTML tags from a String.

#paragraph(quantity) ⇒ Object

Returns the requested number of paragraphs of a Wiki article.

#refs ⇒ Object

Find all references on a page.

#sidebar_image ⇒ Object

Find the image from the sidebar, if one exists.

Methods inherited from Results

#content_split, #pull_from_hash

Constructor Details

#initialize(api_return, prop = 'extract') ⇒ `Text`

Returns a new instance of Text.

# File 'lib/wikiwhat/parse.rb', line 46

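def initialize(api_return, prop='extract')
  # Pull the requested property out of the API response hash.
  @request = self.pull_from_hash(api_return, prop)
  # If the result is an Array, pull the "*" key from its first element.
  if @request.is_a?(Array)
    @request = self.pull_from_hash(@request[0], "*")
  end
end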

Instance Method Details

#find_header(header) ⇒ `Object`

Find all paragraphs under a given heading.

header - the name of the header as a String.

Returns a String. Raises Wikiwhat::WikiwhatError if the header is not on the page.

# File 'lib/wikiwhat/parse.rb', line 87

def find_header(header)
  # Find the requested header
  start = @request.index(header)
  if start
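    # Skip past the "h2>" that closes the requested header's tag.
    end_first_tag = start + @request[start..-1].index("h2") + 3
    # Find the "h2" that opens the next header, if there is one.
    next_h2 = @request[end_first_tag..-1].index("h2")
    # Stop just before the next header's "<", or at the end of the text
    # when the requested header is the last one on the page.
    start_next_tag = next_h2 ? next_h2 + end_first_tag - 2 : -1
    # Select substring of requested text.
    @request[end_first_tag..start_next_tag]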
  else
    raise Wikiwhat::WikiwhatError.new("Sorry, that header isn't on this page.")
  end
end
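
The index arithmetic above is easier to follow on a small input. The following is a minimal standalone sketch, not the library's code: `section_after` is a hypothetical helper that takes the HTML as an argument instead of reading `@request`, raises a plain `ArgumentError` instead of `Wikiwhat::WikiwhatError`, and assumes headings are written as `<h2>...</h2>`.

```ruby
# Hypothetical stand-in for find_header that works on a plain String.
def section_after(html, header)
  start = html.index(header)
  raise ArgumentError, "header not found: #{header}" unless start

  # Move past the "h2>" that closes the requested heading.
  body_start = html.index("h2", start) + 3
  # The next "h2" belongs to the following heading's opening tag, if any.
  next_h2 = html.index("h2", body_start)
  # Stop two characters before that "h2" (just before its "<"),
  # or run to the end of the text when no further heading exists.
  next_h2 ? html[body_start..next_h2 - 2] : html[body_start..-1]
end

page = "<h2>History</h2><p>Founded in 1990.</p><h2>Usage</h2><p>Run it.</p>"

history = section_after(page, "History")
usage = section_after(page, "Usage")
puts history
puts usage
```

Because the search matches the bare substring "h2", any body text that happens to contain "h2" would cut the section short. That limitation applies to this sketch as well.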

#only_text(string) ⇒ `Object`

Removes HTML tags from a String

string - a String that contains HTML tags.

Returns the string without HTML tags.



# File 'lib/wikiwhat/parse.rb', line 107

def only_text(string)
  # Remove every opening and closing tag; the result is the return value.
  string.gsub(/<\/?.*?>/, '')
end
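
As an illustration of what the pattern removes, here is a small standalone sketch. `strip_tags` is a hypothetical name used so the example runs without the gem; it applies the same regular expression to a sample string.

```ruby
# Hypothetical stand-in for only_text, using the same pattern.
def strip_tags(string)
  string.gsub(/<\/?.*?>/, '')
end

html = "<p>Ruby is a <b>dynamic</b> language.</p>"
plain = strip_tags(html)
puts plain
```

Note that `.` does not match a newline in Ruby unless the `m` flag is set, so a tag that spans several lines would be left in place by this pattern.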

#paragraph(quantity) ⇒ `Object`

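Returns the requested number of paragraphs of a Wiki article.

quantity - the number of paragraphs to return, counting from the top of the article. The signature shown has no default value, so quantity must be passed explicitly. If it exceeds the number of paragraphs available, all paragraphs are returned.

Returns an Array of Strings.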

# File 'lib/wikiwhat/parse.rb', line 59

def paragraph(quantity)
  # Break the article into individual paragraphs and store in an array.
  start = @request.split("</p>")

  # Re-add the closing paragraph HTML tags.
  start.each do |string|
    string << "</p>"
  end

  # Check to make sure the quantity being requested is not more paragraphs
  # than exist.
  #
  # Return the correct number of paragraphs assigned to new_arr
  if start.length < quantity
    quantity = start.length - 1
    new_arr = start[0..quantity]
  else
    quantity = quantity - 1
    new_arr = start[0..quantity]
  end
end
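
The split-and-reattach approach can be tried on a short string. This is a sketch under assumptions, not the library's code: `first_paragraphs` is a hypothetical helper that takes the HTML as an argument, and it uses `Array#first` in place of the explicit bounds check.

```ruby
# Hypothetical stand-in for paragraph that works on a plain String.
def first_paragraphs(html, quantity)
  # Split on the closing tag, then put the closing tag back on each piece.
  pieces = html.split("</p>").map { |chunk| chunk + "</p>" }
  # Array#first(n) returns at most n elements, so asking for more
  # paragraphs than exist simply returns all of them.
  pieces.first(quantity)
end

article = "<p>One.</p><p>Two.</p><p>Three.</p>"

two = first_paragraphs(article, 2)
all = first_paragraphs(article, 10)
puts two.inspect
puts all.length
```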

#refs ⇒ `Object`

Find all references on a page.

Returns all references as an Array of Arrays.

TODO: Currently returns a nested Array; the goal is to return an Array of Strings.

# File 'lib/wikiwhat/parse.rb', line 145

def refs
  @content = content_split(1, 2)

  # Collect all references into an array. They are still in wiki markup.
  @content.scan(/<ref>(.*?)<\/ref>/)
end
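
The nested result comes from `String#scan`: when the pattern contains a capture group, each match is returned as an array of its groups. The sketch below shows this on made-up markup and one possible way to get the flat Array of Strings mentioned in the TODO, using `Array#flatten`. The sample text and variable names are illustrative only.

```ruby
# Made-up wiki markup with two placeholder references.
markup = "First claim.<ref>Example source A</ref> " \
         "Second claim.<ref>Example source B</ref>"

# One capture group, so every match is wrapped in its own array.
nested = markup.scan(/<ref>(.*?)<\/ref>/)
# Flatten the one-element arrays into a plain array of strings.
flat = nested.flatten

puts nested.inspect
puts flat.inspect
```

The pattern only matches a bare `<ref>` tag, so a reference written with attributes, such as `<ref name="x">`, would not be picked up.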

#sidebar_image ⇒ `Object`

Find the image from the sidebar, if one exists.

Returns the URL of the image as a String. Raises Wikiwhat::WikiwhatError if the page has no sidebar image.

# File 'lib/wikiwhat/parse.rb', line 119

def sidebar_image
  # Check to see if a sidebar image exists
  if self.content_split(0)[/(image).*?(\.\w\w(g|G|f|F))/]
    # Grab the sidebar image title
    image_name = self.content_split(0)[/(image).*?(\.\w\w(g|G|f|F))/]
    # Remove the 'image = ' part of the string
    image_name = image_name.split("=")[1].strip
    # Call Wikipedia for image url
    get_url = Wikiwhat::Call.call_api(('File:'+ image_name),
      :prop => "imageinfo", :iiprop => true)
    # Pull url from hash
    img_name_2 = pull_from_hash(get_url, "pages")
    img_array = pull_from_hash(img_name_2, "imageinfo")
    img_array[0]["url"]
  else
    # If no sidebar image exists, raise error.
    raise Wikiwhat::WikiwhatError.new("Sorry, it looks like there is no sidebar image " \
      "on this page.")
  end
end
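
The file-name extraction step can be checked without calling the API. The following sketch applies the same regular expression and the same split-and-strip step to a made-up infobox string. The sample markup and variable names are assumptions for illustration.

```ruby
# Same pattern as in sidebar_image: "image", then the shortest run of
# characters up to a dot followed by two word characters and g, G, f or F.
image_pattern = /(image).*?(\.\w\w(g|G|f|F))/

infobox = "{{Infobox software\n| name = Example\n" \
          "| image = Example logo.png\n| caption = Logo\n}}"

# String#[] with a Regexp returns the matched text, or nil if no match.
matched = infobox[image_pattern]
# Drop the "image = " part and surrounding whitespace.
image_name = matched.split("=")[1].strip

no_image = "{{Infobox\n| name = Example\n}}"[image_pattern]
jpeg = "| image = Photo.jpeg"[image_pattern]

puts matched
puts image_name
```

The pattern expects a three-character extension ending in g or f (for example .png, .jpg, .svg, .gif), so a file named with .jpeg is not matched.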
