Class: Rsel::StudyHtml
- Inherits: Object
- Inheritance chain: Object, Rsel::StudyHtml
- Includes: Support
- Defined in: lib/rsel/study_html.rb
Overview
Class to study a web page: Parses it with Nokogiri, and allows searching and simplifying Selenium-like expressions.
Constant Summary
- NO_PAGE_LOADED = 1000000
  A large sentinel for @dirties.
Instance Method Summary
- #begin_section ⇒ Object
  Store the current keep_clean status, and begin forcing study use until the next #end_section.
- #clean? ⇒ Boolean
  Return whether the studied page is clean and ready for analysis.
- #dirty ⇒ Object
  "Dirty" the studied page, marking one (potential) change since the page was studied.
- #end_section ⇒ Object
  Restore the keep_clean status from before the last #begin_section.
- #get_node(locator) ⇒ Object
  Find a studied node by almost any type of Selenium locator.
- #initialize(first_page = nil) ⇒ StudyHtml (constructor)
  A new instance of StudyHtml.
- #keep_clean(switch) ⇒ Object
  Turn on or off maintenance of clean status.
- #keeping_clean? ⇒ Boolean
  Return whether keep_clean is on or not.
- #simplify_locator(locator, tocss = true) ⇒ Object
  Simplify a Selenium-like locator (xpath or a css path), based on the studied page.
- #study(page, keep = false) ⇒ Object
  Load a page to study.
- #undo_all_dirties ⇒ Object
  Try to un-#dirty the studied page.
- #undo_last_dirty ⇒ Object
  Try to un-#dirty the studied page by marking one (potential) change since the page was studied as not actually a change.
Methods included from Support
#apply_scope, #csspath, #escape_for_hash, #failed_within, #globify, #loc, #normalize_ids, #result_within, #selenium_compare, #string_is_true?, #strip_tags, #xpath, #xpath_expressions, #xpath_row_containing, #xpath_sanitize
Constructor Details
#initialize(first_page = nil) ⇒ StudyHtml
# File 'lib/rsel/study_html.rb', line 14
def initialize(first_page=nil)
  @sections_kept_clean = []
  if first_page
    study(first_page)
  else
    @studied_page = nil
    # Invariant: @dirties == 0 while @keep_clean is true.
    @keep_clean = false
    # No page is loaded. Set a large sentinel value so it's never tested.
    @dirties = NO_PAGE_LOADED
  end
end
Instance Method Details
#begin_section ⇒ Object
Store the current keep_clean status, and begin forcing study use until the
next #end_section.
The block is semi-optional: when given, it must return the first argument to
pass to #study. It is not needed if the page is already #clean?; otherwise,
calling without a block raises an exception. Note that studying a new page
here erases all previously stored sections.
# File 'lib/rsel/study_html.rb', line 105
def begin_section
  last_keep_clean = @keep_clean
  if clean?
    @keep_clean = true
  else
    # This will erase all prior sections.
    study(yield, true)
  end
  @sections_kept_clean.push(last_keep_clean)
end
#clean? ⇒ Boolean
Return whether the studied page is clean and ready for analysis. This is a query, not a command: it does not call #undo_all_dirties.
# File 'lib/rsel/study_html.rb', line 72
def clean?
  return @dirties == 0
end
#dirty ⇒ Object
"Dirty" the studied page, marking one (potential) change since the page was studied. This can be undone: see #undo_last_dirty. Does nothing while keep_clean is on, for example after #keep_clean(true) succeeds or after #study is called with keep set to true.
# File 'lib/rsel/study_html.rb', line 50
def dirty
  @dirties += 1 unless @keep_clean
end
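The counting behind #dirty, #undo_last_dirty and #clean? can be sketched without Nokogiri. The following is a minimal stand-in, not the Rsel class itself: DirtyCounter is a hypothetical name, and it assumes a page has just been studied (so the count starts at zero).

```ruby
# Hypothetical stand-in for the counter logic in Rsel::StudyHtml.
# It keeps only @dirties and @keep_clean; no page is parsed.
class DirtyCounter
  def initialize(keep_clean = false)
    @dirties = 0              # assumption: a page was just studied
    @keep_clean = keep_clean
  end

  # Record one potential change, unless clean status is being maintained.
  def dirty
    @dirties += 1 unless @keep_clean
  end

  # Take back one recorded change; the count never drops below zero.
  def undo_last_dirty
    @dirties -= 1 unless @dirties <= 0
  end

  def clean?
    @dirties == 0
  end
end

counter = DirtyCounter.new
counter.dirty
counter.dirty
counter.undo_last_dirty
puts counter.clean?   # one change is still outstanding
counter.undo_last_dirty
puts counter.clean?   # both changes have been taken back

kept = DirtyCounter.new(true)
kept.dirty            # ignored while keeping clean
puts kept.clean?
```

Each #dirty call needs its own #undo_last_dirty before the page counts as clean again, which is why a single undo "may or may not be enough".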
#end_section ⇒ Object
Restore the keep_clean status from before the last #begin_section. Also marks the page dirty unless the restored keep_clean status was true. It is fine to call this more times than #begin_section: once the stack of saved statuses is empty, it acts just like keep_clean(false).
# File 'lib/rsel/study_html.rb', line 120
def end_section
  # Can't just assign - what if nil is popped?
  if @sections_kept_clean.pop
    @keep_clean = true
  else
    @keep_clean = false
    dirty
  end
  return true
end
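The nesting behaviour of #begin_section and #end_section can be illustrated with a small model. This is a sketch under assumptions, not the Rsel implementation: SectionStack is a hypothetical class, and the page is assumed to be clean already, so begin_section never needs a block.

```ruby
# Hypothetical stand-in for the section stack in Rsel::StudyHtml.
class SectionStack
  attr_reader :keep_clean, :dirties

  def initialize
    @keep_clean = false
    @dirties = 0
    @sections_kept_clean = []
  end

  # Save the current status, then start keeping clean.
  def begin_section
    @sections_kept_clean.push(@keep_clean)
    @keep_clean = true
  end

  # Restore the saved status. pop returns nil on an empty stack,
  # which is handled the same way as a saved false.
  def end_section
    if @sections_kept_clean.pop
      @keep_clean = true
    else
      @keep_clean = false
      @dirties += 1
    end
    true
  end
end

stack = SectionStack.new
stack.begin_section   # outer section: saved status is false
stack.begin_section   # inner section: saved status is true
stack.end_section     # restores true; the page stays clean
stack.end_section     # restores false; the page is dirtied once
stack.end_section     # extra call on an empty stack; dirtied again
puts stack.keep_clean
puts stack.dirties
```

Only the outermost end_section (or an unmatched one) dirties the page, so inner sections can be closed without invalidating the studied page.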
#get_node(locator) ⇒ Object
Find a studied node by almost any type of Selenium locator. Returns a Nokogiri::XML::Node, or nil if the node is not found, the page is not clean, or the locator is a dom= expression.
# File 'lib/rsel/study_html.rb', line 172
def get_node(locator)
  return nil if @dirties > 0
  case locator
  when /^id=/, /^name=/
    locator = locator.gsub("'","\\\\'").gsub(/([a-z]+)=([^ ]*) */, "[@\\1='\\2']")
    locator = locator.sub(/\]([^ ]+) */, "][@value='\\1']")
    return @studied_page.at_xpath("//*#{locator}")
  when /^link=/
    # Parse the link through loc (which may simplify it to an id or something).
    # Then try get_node again. It should not return to this spot.
    return get_node(loc(locator[5,locator.length], 'link'))
  when /^css=/
    return @studied_page.at_css(locator[4,locator.length])
  when /^xpath=/, /^\/\//
    return @studied_page.at_xpath(locator.sub(/^xpath=/,''))
  when /^dom=/, /^document\./
    # Can't parse dom=
    return nil
  else
    locator = locator.sub(/^id(entifier)?=/,'')
    retval = @studied_page.at_xpath("//*[@id='#{locator}']")
    retval = @studied_page.at_xpath("//*[@name='#{locator}']") unless retval
    return retval
  end
end
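To see what the id=/name= branch hands to Nokogiri, the two substitutions can be run on their own. The helper below is a hypothetical name introduced for illustration; it applies the same regular expressions as the listing above and returns the XPath string instead of querying a page.

```ruby
# Hypothetical helper: mirrors the id=/name= branch of get_node, but
# returns the XPath expression rather than looking up a node.
def locator_to_xpath(locator)
  # Escape single quotes, then turn each key=value pair into a predicate.
  predicates = locator.gsub("'", "\\\\'")
                      .gsub(/([a-z]+)=([^ ]*) */, "[@\\1='\\2']")
  # A trailing bare word is treated as a filter on the value attribute.
  predicates = predicates.sub(/\]([^ ]+) */, "][@value='\\1']")
  "//*#{predicates}"
end

puts locator_to_xpath("id=submit")
puts locator_to_xpath("name=color red")
```

With these inputs the helper yields //*[@id='submit'] and //*[@name='color'][@value='red'], which are the expressions at_xpath would receive.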
#keep_clean(switch) ⇒ Object
Turn on or off maintenance of clean status. True prevents #dirty from having any effect. Also cleans all dirties (with #undo_all_dirties) if true, or dirties the page (with #dirty) if false.
# File 'lib/rsel/study_html.rb', line 79
def keep_clean(switch)
  if switch
    if undo_all_dirties
      @keep_clean = true
      return true
    else
      return false
    end
  else
    @keep_clean = false
    dirty
    return true
  end
end
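The interaction between #keep_clean, #undo_all_dirties and the NO_PAGE_LOADED sentinel can be modelled in a few lines. This is a sketch, assuming a hypothetical CleanSwitch class in which the "page" is any non-nil object; nothing is parsed.

```ruby
# Hypothetical stand-in: `page` is any object, or nil when nothing is loaded.
class CleanSwitch
  NO_PAGE_LOADED = 1000000

  def initialize(page = nil)
    @studied_page = page
    @keep_clean = false
    @dirties = page ? 0 : NO_PAGE_LOADED
  end

  # Succeeds only when there is a page to clean.
  def undo_all_dirties
    if @studied_page
      @dirties = 0
    else
      @keep_clean = false
      @dirties = NO_PAGE_LOADED
    end
    @dirties == 0
  end

  # Turning keep_clean on requires a page; turning it off dirties the page.
  def keep_clean(switch)
    if switch
      return false unless undo_all_dirties
      @keep_clean = true
    else
      @keep_clean = false
      @dirties += 1
    end
    true
  end

  def keeping_clean?
    @keep_clean
  end
end

puts CleanSwitch.new.keep_clean(true)                 # no page loaded
puts CleanSwitch.new("<html></html>").keep_clean(true)
```

Checking the return value of keep_clean(true) is therefore a cheap way to detect that no page has been studied yet.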
#keeping_clean? ⇒ Boolean
Return whether keep_clean is on or not. Useful if you want to start keeping clean and then return to your previous state. Invariant: clean? == true if keeping_clean? == true.
# File 'lib/rsel/study_html.rb', line 96
def keeping_clean?
  return @keep_clean
end
#simplify_locator(locator, tocss = true) ⇒ Object
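Simplify a Selenium-like locator (xpath or a css path), based on the studied page. Returns the locator unchanged if the page is not clean, if the locator is of another type, or if no matching node is found.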
# File 'lib/rsel/study_html.rb', line 136
def simplify_locator(locator, tocss=true)
  return locator if @dirties > 0
  # We need a locator using either a css= or xpath= expression.
  if locator[0,4] == 'css='
    studied_node = @studied_page.at_css(locator[4,locator.length])
    # If we're already using a css path, don't bother simplifying it to another css path.
    tocss = false
  elsif locator[0,6] == 'xpath=' || locator[0,2] == '//'
    locator = 'xpath='+locator if locator[0,2] == '//'
    studied_node = @studied_page.at_xpath(locator[6,locator.length])
  else
    # Some other kind of locator. Just return it.
    return locator
  end
  # If the path wasn't found, just return the locator; maybe the browser will
  # have better luck. (Or return a better error message!)
  return locator if studied_node == nil
  # Now let's try simplified locators. First, id.
  return "id=#{studied_node['id']}" if(studied_node['id'] &&
    @studied_page.at_xpath("//*[@id='#{studied_node['id']}']") == studied_node)
  # Next, name. Same pattern.
  return "name=#{studied_node['name']}" if(studied_node['name'] &&
    @studied_page.at_xpath("//*[@name='#{studied_node['name']}']") == studied_node)
  # Link, perhaps?
  return "link=#{studied_node.inner_text}" if(studied_node.node_name.downcase == 'a' &&
    @studied_page.at_xpath("//a[text()='#{studied_node.inner_text}']") == studied_node)
  # Finally, try a CSS path. Make that a simple xpath, since nth-of-type
  # doesn't work. But give up if we were told not to convert to CSS.
  return locator unless tocss
  return "xpath=#{studied_node.path}"
end
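The order of attempts in simplify_locator (id, name, link text, then the node's path) can be shown with a Nokogiri-free model. Everything below is an assumption made for illustration: Node and simplest_locator are hypothetical names, a "page" is just an array of nodes in document order, and link matching compares plain text rather than evaluating an XPath text() test.

```ruby
# Hypothetical model of a parsed element; path stands in for Nokogiri's node path.
Node = Struct.new(:tag, :id, :name, :text, :path)

# An attribute is used only when the first node on the page carrying that
# value is the node itself, which mirrors the at_xpath(...) == node checks.
def simplest_locator(node, page, original, tocss = true)
  if node.id && page.find { |n| n.id == node.id }.equal?(node)
    return "id=#{node.id}"
  end
  if node.name && page.find { |n| n.name == node.name }.equal?(node)
    return "name=#{node.name}"
  end
  if node.tag == 'a' &&
     page.find { |n| n.tag == 'a' && n.text == node.text }.equal?(node)
    return "link=#{node.text}"
  end
  # Fall back to the node's path unless conversion was disabled.
  return original unless tocss
  "xpath=#{node.path}"
end

first_field  = Node.new('input', nil, 'q', '', '/html/body/form/input[1]')
second_field = Node.new('input', nil, 'q', '', '/html/body/form/input[2]')
home_link    = Node.new('a', nil, nil, 'Home', '/html/body/a')
page = [first_field, second_field, home_link]

puts simplest_locator(first_field, page, 'xpath=//form/input[1]')
puts simplest_locator(second_field, page, 'xpath=//form/input[2]')
puts simplest_locator(home_link, page, 'xpath=//a')
```

The second field shares its name with the first, so the name is ambiguous and the sketch falls back to the path, just as the real method only accepts an id or name when the lookup returns the same node.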
#study(page, keep = false) ⇒ Object
Load a page to study.
# File 'lib/rsel/study_html.rb', line 33
def study(page, keep=false)
  @sections_kept_clean = []
  begin
    @studied_page = Nokogiri::HTML(page)
    @dirties = 0
  rescue => e
    @keep_clean = false
    @dirties = NO_PAGE_LOADED
    @studied_page = nil
    raise e
  end
  @keep_clean = keep
end
#undo_all_dirties ⇒ Object
Try to un-#dirty the studied page. Returns true on success or false if there was no page to clean.
# File 'lib/rsel/study_html.rb', line 61
def undo_all_dirties
  if @studied_page != nil
    @dirties = 0
  else
    @keep_clean = false
    @dirties = NO_PAGE_LOADED
  end
  return @dirties == 0
end
#undo_last_dirty ⇒ Object
Try to un-#dirty the studied page by marking one (potential) change since the page was studied as not actually a change. This may or may not be enough to resume using the studied page. Cannot be used preemptively: the count never drops below zero.
# File 'lib/rsel/study_html.rb', line 56
def undo_last_dirty
  @dirties -= 1 unless @dirties <= 0
end