Class: Wikiwhat::Text

Inherits:
Results
Defined in:
lib/wikiwhat/parse.rb

Overview

Extracts portions of text from a Wiki article.

Instance Method Summary

Methods inherited from Results

#pull_from_hash

Constructor Details

#initialize(api_return, prop = 'extract') ⇒ Text

Returns a new instance of Text.



# File 'lib/wikiwhat/parse.rb', line 30

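def initialize(api_return, prop='extract')
  # Pull the requested property out of the API response.
  @request = self.pull_from_hash(api_return, prop)
  # When the result is an Array, take the "*" entry of its first element.
  if @request.is_a?(Array)
    @request = self.pull_from_hash(@request[0], "*")
  end
end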
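
To illustrate the two branches of the constructor, here is a minimal standalone sketch. `find_key` is a hypothetical stand-in for the inherited `pull_from_hash` (whose source is not shown on this page), and both sample hashes are made-up data shaped like typical API responses, so treat this as an approximation rather than the library's behavior.

```ruby
# Hypothetical stand-in for Results#pull_from_hash: depth-first search of a
# nested Hash for the first value stored under `key`.
def find_key(hash, key)
  return hash[key] if hash.key?(key)

  hash.each_value do |value|
    next unless value.is_a?(Hash)

    found = find_key(value, key)
    return found unless found.nil?
  end
  nil
end

# Mirrors the constructor: look up `prop`, then unwrap an Array result by
# reading the "*" key of its first element.
def extract_text(api_return, prop = 'extract')
  request = find_key(api_return, prop)
  request = find_key(request[0], "*") if request.is_a?(Array)
  request
end

# Made-up sample data (assumption): a plain-String property and an
# Array-wrapped property.
extract_response  = { "query" => { "pages" => { "736" => { "extract" => "<p>Albert Einstein</p>" } } } }
revision_response = { "query" => { "pages" => { "736" => { "revisions" => [{ "*" => "{{Infobox scientist}}" }] } } } }

puts extract_text(extract_response)
puts extract_text(revision_response, 'revisions')
```

In both cases the result is a plain String, which is what the instance methods below operate on.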

Instance Method Details

#find_header(header) ⇒ Object

Finds all paragraphs under a given heading.

header - the name of the header as a String.

Returns the text between that heading and the next h2 tag as a String. Raises Wikiwhat::WikiwhatError if the header is not on the page.



# File 'lib/wikiwhat/parse.rb', line 71

def find_header(header)
  # Find the requested header
  start = @request.index(header)
  if start
    # Find the end of the header's closing h2 tag.
    end_first_tag = start + @request[start..-1].index("h2") + 3
    # Find the start of the next h2 tag; if there is none, the section runs
    # to the end of the text.
    next_h2 = @request[end_first_tag..-1].index("h2")
    start_next_tag = next_h2 ? next_h2 + end_first_tag - 2 : -1
    # Select substring of requested text.
    @request[end_first_tag..start_next_tag]
  else
    raise Wikiwhat::WikiwhatError.new("Sorry, that header isn't on this page.")
  end
end
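
The index arithmetic is easier to follow on a small input. The sketch below is a standalone version of the same idea, assuming the header text sits inside an `<h2>` element; `section_after` and the sample HTML are illustrative names and data, not part of the gem. Unlike the method above, it returns nil for a missing header instead of raising.

```ruby
# Return the HTML between the heading that contains `header` and the next
# h2 tag, or nil if the header text is not found.
def section_after(html, header)
  start = html.index(header)
  return nil unless start

  # Move past the "h2>" that closes the heading tag.
  body_start = start + html[start..-1].index("h2") + 3
  # Offset of the next "h2"; nil means this is the last section.
  next_h2 = html[body_start..-1].index("h2")
  # Step back over the "<" that precedes the next "h2".
  body_end = next_h2 ? body_start + next_h2 - 2 : -1
  html[body_start..body_end]
end

html = "<h2>History</h2><p>Founded in 1900.</p><h2>Geography</h2><p>Flat.</p>"

puts section_after(html, "History")    # <p>Founded in 1900.</p>
puts section_after(html, "Geography")  # <p>Flat.</p>
```

Because the search looks for the bare string "h2", a section whose body happens to contain that text would be cut short; an HTML parser would be more robust than this index-based approach.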

#only_text(string) ⇒ Object

Removes HTML tags from a String.

string - a String that contains HTML tags.

Returns the string without HTML tags.



# File 'lib/wikiwhat/parse.rb', line 91

def only_text(string)
  # Strip anything that looks like an opening or closing HTML tag.
  string.gsub(/<\/?.*?>/, '')
end
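
As a quick illustration of what the pattern removes, here is the same expression applied to a sample string outside the class. `strip_tags` is an illustrative name, not part of the gem.

```ruby
# Standalone copy of the tag-stripping expression, for illustration only.
def strip_tags(string)
  # Lazily match "<", an optional "/", any characters, and the first ">".
  string.gsub(/<\/?.*?>/, '')
end

sample = "<p>Albert <b>Einstein</b> was a physicist.</p>"
puts strip_tags(sample)  # Albert Einstein was a physicist.
```

Since `.` does not match a newline by default, a tag that spans several lines would be left in place.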

#paragraph(quantity) ⇒ Object

Returns the requested number of paragraphs of a Wiki article.

quantity - the Number of paragraphs to return, counted from the top of the article. The method signature declares no default, so a value must be passed.

Returns an Array of Strings.



# File 'lib/wikiwhat/parse.rb', line 43

def paragraph(quantity)
  # Break the article into individual paragraphs and re-add the closing
  # paragraph HTML tag that split removed.
  paragraphs = @request.split("</p>").map { |string| string + "</p>" }

  # Return at most the requested number of paragraphs. Array#first copes
  # with a quantity larger than the number of paragraphs, and returns an
  # empty Array for a quantity of 0.
  paragraphs.first(quantity)
end
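
The split-and-trim step can be tried on a small string outside the class. This is a sketch using `Array#first`; `first_paragraphs` and the sample HTML are illustrative, not part of the gem.

```ruby
# Split HTML on closing paragraph tags, restore each tag, and keep at most
# `quantity` paragraphs from the top.
def first_paragraphs(html, quantity)
  html.split("</p>").map { |chunk| chunk + "</p>" }.first(quantity)
end

html = "<p>One.</p><p>Two.</p><p>Three.</p>"

p first_paragraphs(html, 2)   # ["<p>One.</p>", "<p>Two.</p>"]
p first_paragraphs(html, 10)  # all three paragraphs
```

Asking for more paragraphs than exist simply returns everything, so no separate length check is needed.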

#refs ⇒ Object

Finds all references on a page.

Returns all references as an Array of Arrays.

TODO: Currently nested array, want to return as array of strings.



# File 'lib/wikiwhat/parse.rb', line 128

def refs
  @content = content_split(1, 2)

  # Add all references to an array; they are still in wiki markup.
  @content.scan(/<ref>(.*?)<\/ref>/)
end
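
The nested result comes from how `String#scan` treats a pattern with a capture group: each match becomes an Array of its captures. The sketch below shows this on made-up markup, and shows that `Array#flatten` is one possible way to get the Array of Strings the TODO asks for; it is not something the gem currently does.

```ruby
# Made-up wiki markup containing two references.
markup = "Born in Ulm.<ref>Smith 2001, p. 4</ref> Moved to Bern.<ref>Jones 1999</ref>"

# With one capture group, scan returns one single-element Array per match.
nested = markup.scan(/<ref>(.*?)<\/ref>/)
p nested  # [["Smith 2001, p. 4"], ["Jones 1999"]]

# Flattening yields a plain Array of Strings.
flat = nested.flatten
p flat    # ["Smith 2001, p. 4", "Jones 1999"]
```

The pattern only matches a bare `<ref>` tag, so references written with attributes, such as `<ref name="x">`, are not captured.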

#sidebar_image ⇒ Object

Finds the image from the sidebar, if one exists.

Returns the URL of the image as a String. Raises Wikiwhat::WikiwhatError if the page has no sidebar image.



# File 'lib/wikiwhat/parse.rb', line 103

def sidebar_image
  # Look for the sidebar image title (e.g. "image = Name.jpg").
  image_name = content_split(0)[/(image\s* =\s*).*?\w(\.\w\w(g|f))/]
  if image_name
    # Remove the 'image = ' part of the string
    image_name = image_name.split("= ")[1]
    # Call Wikipedia for image url
    get_url = Wikiwhat::Call.call_api(('File:'+ image_name), :prop => "imageinfo", :iiprop => true)
    # Pull url from hash
    img_name_2 = pull_from_hash(get_url, "pages")
    img_array = pull_from_hash(img_name_2, "imageinfo")
    img_array[0]["url"]
  else
    # If no sidebar image exists, raise error.
    raise Wikiwhat::WikiwhatError.new("Sorry, it looks like there is no sidebar image on this page.")
  end
end
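
The file-name extraction can be checked without calling the API. The sketch below applies the same regular expression to a made-up infobox. `sidebar_image_name` and the sample markup are illustrative and not part of the gem, and the helper returns nil where the method above raises.

```ruby
# Same pattern as in sidebar_image: "image =", then the shortest run of
# characters ending in a three-letter extension whose last letter is g or f.
SIDEBAR_IMAGE = /(image\s* =\s*).*?\w(\.\w\w(g|f))/

# Return the image file name from infobox markup, or nil if none is found.
def sidebar_image_name(markup)
  match = markup[SIDEBAR_IMAGE]
  match && match.split("= ")[1]
end

# Made-up infobox markup for illustration.
infobox = "{{Infobox scientist\n| name = Albert Einstein\n| image = Einstein 1921.jpg\n| caption = In 1921\n}}"

puts sidebar_image_name(infobox)  # Einstein 1921.jpg
```

The extension part of the pattern accepts names such as `.jpg`, `.png`, `.svg`, and `.gif`, but not a four-letter extension such as `.jpeg`.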