Class: Rsel::StudyHtml

Inherits: Object

Includes: Support

Defined in: lib/rsel/study_html.rb

Overview

Class to study a web page: parses it with Nokogiri, then supports looking up nodes by Selenium-like locators and simplifying those locators.

Constant Summary

NO_PAGE_LOADED = A large sentinel for @dirties.

Instance Method Summary

#begin_section ⇒ Object
Store the current keep_clean status, and begin forcing study use until the next #end_section.
#clean? ⇒ Boolean
Return whether the studied page is clean and ready for analysis.
#dirty ⇒ Object
"Dirty" the studied page, marking one (potential) change since the page was studied.
#end_section ⇒ Object
Restore the keep_clean status from before the last #begin_section.
#get_node(locator) ⇒ Object
Find a studied node by almost any type of Selenium locator.
#initialize(first_page = nil) ⇒ StudyHtml constructor
A new instance of StudyHtml.
#keep_clean(switch) ⇒ Object
Turn on or off maintenance of clean status.
#keeping_clean? ⇒ Boolean
Return whether keep_clean is on or not.
#simplify_locator(locator, tocss = true) ⇒ Object
Simplify a Selenium-like locator (xpath or a css path), based on the studied page.
#study(page, keep = false) ⇒ Object
Load a page to study.
#undo_all_dirties ⇒ Object
Try to un-#dirty the studied page.
#undo_last_dirty ⇒ Object
Try to un-#dirty the studied page by marking one (potential) change since the page was studied as not actually a change.

Methods included from Support

#apply_scope, #csspath, #escape_for_hash, #failed_within, #globify, #loc, #normalize_ids, #result_within, #selenium_compare, #string_is_true?, #strip_tags, #xpath, #xpath_expressions, #xpath_row_containing, #xpath_sanitize

Constructor Details

#initialize(first_page = nil) ⇒ `StudyHtml`

Returns a new instance of StudyHtml.

# File 'lib/rsel/study_html.rb', line 14

def initialize(first_page=nil)
  @sections_kept_clean = []
  if first_page
    study(first_page)
  else
    @studied_page = nil
    # Invariant: @dirties == 0 while @keep_clean is true.
    @keep_clean = false
    # No page is loaded.  Set a large sentinel value so it's never tested.
    @dirties = NO_PAGE_LOADED
  end
end

Instance Method Details

#begin_section ⇒ `Object`

Store the current keep_clean status, and begin forcing study use until the next #end_section.
The block is semi-optional: its return value is passed as the first argument to #study. It is not required if the page is #clean?; otherwise, calling this method without a block raises a LocalJumpError from the bare yield.

# File 'lib/rsel/study_html.rb', line 105

def begin_section
  last_keep_clean = @keep_clean
  if clean?
    @keep_clean = true
  else
    # This will erase all prior sections.
    study(yield, true)
  end
  @sections_kept_clean.push(last_keep_clean)
end

#clean? ⇒ `Boolean`

Return whether the studied page is clean and ready for analysis. This is a query, not a command: it does not call #undo_all_dirties.

Returns:

(Boolean)




# File 'lib/rsel/study_html.rb', line 72

def clean?
  return @dirties == 0
end

#dirty ⇒ `Object`

"Dirty" the studied page, marking one (potential) change since the page was studied. This can be undone: see #undo_last_dirty. Does nothing if #keep_clean has been called with true, which may occur from #study.




# File 'lib/rsel/study_html.rb', line 50

def dirty
  @dirties += 1 unless @keep_clean
end

#end_section ⇒ `Object`

Restore the keep_clean status from before the last #begin_section. Also marks the page dirty unless the last keep_clean was true. It's fine to call this more times than you call #begin_section: once the stack of saved statuses is empty, it acts just like keep_clean(false).

# File 'lib/rsel/study_html.rb', line 120

def end_section
  # Can't just assign - what if nil is popped?
  if @sections_kept_clean.pop
    @keep_clean = true
  else
    @keep_clean = false
    dirty
  end
  return true
end
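
The stack behaviour described above can be sketched without Nokogiri. This is a minimal illustration, not the library's code: SectionSketch is a hypothetical stand-in that copies only the stack handling, assumes a page was just studied, and models only the clean branch of #begin_section.

```ruby
# Hypothetical stand-in (not part of Rsel): only the section stack is modelled.
class SectionSketch
  attr_reader :keep_clean, :dirties

  def initialize
    @sections_kept_clean = []
    @keep_clean = false
    @dirties = 0 # assumption: a page has just been studied
  end

  # Clean branch of #begin_section: remember the old status, then keep clean.
  def begin_section
    @sections_kept_clean.push(@keep_clean)
    @keep_clean = true
  end

  # Same logic as #end_section: pop returns nil once the stack is empty.
  def end_section
    if @sections_kept_clean.pop
      @keep_clean = true
    else
      @keep_clean = false
      @dirties += 1 # what #dirty does once keep_clean is off
    end
    true
  end
end

sketch = SectionSketch.new
sketch.begin_section # outer section saves false
sketch.begin_section # inner section saves true
sketch.end_section   # restores true, page stays clean
sketch.end_section   # restores false, page is dirtied once
sketch.end_section   # stack is empty, behaves like keep_clean(false)
puts sketch.dirties  # 2
```

Each unmatched or outermost end_section adds one dirty, so the studied page stops being used until it is studied again or the dirties are undone.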

#get_node(locator) ⇒ `Object`

Find a studied node by almost any type of Selenium locator. Returns a Nokogiri::Node, or nil if not found.

# File 'lib/rsel/study_html.rb', line 172

def get_node(locator)
  return nil if @dirties > 0
  case locator
  when /^id=/, /^name=/
    locator = locator.gsub("'","\\\\'").gsub(/([a-z]+)=([^ ]*) */, "[@\\1='\\2']")
    locator = locator.sub(/\]([^ ]+) */, "][@value='\\1']")
    return @studied_page.at_xpath("//*#{locator}")
  when /^link=/
    # Parse the link through loc (which may simplify it to an id or something).
    # Then try get_node again.  It should not return to this spot.
    return get_node(loc(locator[5,locator.length], 'link'))
  when /^css=/
    return @studied_page.at_css(locator[4,locator.length])
  when /^xpath=/, /^\/\//
    return @studied_page.at_xpath(locator.sub(/^xpath=/,''))
  when /^dom=/, /^document\./
    # Can't parse dom=
    return nil
  else
    locator = locator.sub(/^id(entifier)?=/,'')
    retval = @studied_page.at_xpath("//*[@id='#{locator}']")
    retval = @studied_page.at_xpath("//*[@name='#{locator}']") unless retval
    return retval
  end
end
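
The two substitutions in the id=/name= branch are easier to follow on concrete input. The sketch below wraps them in a hypothetical helper, id_or_name_xpath (not part of Rsel), so the resulting XPath string can be inspected without a parsed page.

```ruby
# Hypothetical helper: the same two substitutions as the id=/name= branch
# of #get_node, returning the XPath string instead of querying a page.
def id_or_name_xpath(locator)
  # Turn each key=value pair into an attribute predicate.
  predicates = locator.gsub("'", "\\\\'").gsub(/([a-z]+)=([^ ]*) */, "[@\\1='\\2']")
  # A bare trailing word is treated as a value filter.
  predicates = predicates.sub(/\]([^ ]+) */, "][@value='\\1']")
  "//*#{predicates}"
end

puts id_or_name_xpath('id=login')              # //*[@id='login']
puts id_or_name_xpath('name=flavor chocolate') # //*[@name='flavor'][@value='chocolate']
```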

#keep_clean(switch) ⇒ `Object`

Turn on or off maintenance of clean status. True prevents #dirty from having any effect. Also cleans all dirties (with #undo_all_dirties) if true, or dirties the page (with #dirty) if false.

# File 'lib/rsel/study_html.rb', line 79

def keep_clean(switch)
  if switch
    if undo_all_dirties
      @keep_clean = true
      return true
    else
      return false
    end
  else
    @keep_clean = false
    dirty
    return true
  end
end
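
The return values of #keep_clean depend on whether a page is loaded. A minimal sketch, assuming a hypothetical stand-in class (KeepCleanSketch, not part of Rsel) that copies the logic of #keep_clean, #undo_all_dirties and #dirty; the value given to NO_PAGE_LOADED here is an assumption, since this page does not show the real one.

```ruby
# Hypothetical stand-in (not part of Rsel); any non-nil object counts as a page.
class KeepCleanSketch
  NO_PAGE_LOADED = 1_000_000 # assumed value; the real constant is not shown here

  def initialize(page = nil)
    @studied_page = page
    @keep_clean = false
    @dirties = page ? 0 : NO_PAGE_LOADED
  end

  def dirty
    @dirties += 1 unless @keep_clean
  end

  def clean?
    @dirties == 0
  end

  def keeping_clean?
    @keep_clean
  end

  def undo_all_dirties
    if @studied_page
      @dirties = 0
    else
      @keep_clean = false
      @dirties = NO_PAGE_LOADED
    end
    @dirties == 0
  end

  def keep_clean(switch)
    if switch
      return false unless undo_all_dirties
      @keep_clean = true
    else
      @keep_clean = false
      dirty
    end
    true
  end
end

puts KeepCleanSketch.new.keep_clean(true) # false: nothing to keep clean

loaded = KeepCleanSketch.new('<p>hi</p>')
loaded.dirty
puts loaded.keep_clean(true)  # true: dirties are reset first
loaded.dirty                  # ignored while keeping clean
puts loaded.clean?            # true
loaded.keep_clean(false)
puts loaded.clean?            # false: switching off dirties the page
```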

#keeping_clean? ⇒ `Boolean`

Return whether keep_clean is on or not. Useful if you want to start keeping clean and then return to your previous state. Invariant: clean? == true if keeping_clean? == true.

Returns:

(Boolean)




# File 'lib/rsel/study_html.rb', line 96

def keeping_clean?
  return @keep_clean
end

#simplify_locator(locator, tocss = true) ⇒ `Object`

Simplify a Selenium-like locator (xpath or a css path), based on the studied page.

Parameters:

locator (String)
tocss (Boolean) (defaults to: true) —
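Whether to fall back to a positional path as a last resort. Despite the name, the fallback in the source listing for this method is an xpath= locator, not css=. Defaults to true.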

# File 'lib/rsel/study_html.rb', line 136

def simplify_locator(locator, tocss=true)
  return locator if @dirties > 0

  # We need a locator using either a css= or xpath= expression.
  if locator[0,4] == 'css='
    studied_node = @studied_page.at_css(locator[4,locator.length])
    # If we're already using a css path, don't bother simplifying it to another css path.
    tocss = false
  elsif locator[0,6] == 'xpath=' || locator[0,2] == '//'
    locator = 'xpath='+locator if locator[0,2] == '//'
    studied_node = @studied_page.at_xpath(locator[6,locator.length])
  else
    # Some other kind of locator.  Just return it.
    return locator
  end
  # If the path wasn't found, just return the locator; maybe the browser will
  # have better luck.  (Or return a better error message!)
  return locator if studied_node == nil

  # Now let's try simplified locators.  First, id.
  return "id=#{studied_node['id']}" if(studied_node['id'] &&
                                       @studied_page.at_xpath("//*[@id='#{studied_node['id']}']") == studied_node)
  # Next, name.  Same pattern.
  return "name=#{studied_node['name']}" if(studied_node['name'] &&
                                           @studied_page.at_xpath("//*[@name='#{studied_node['name']}']") == studied_node)

  # Link, perhaps?
  return "link=#{studied_node.inner_text}" if(studied_node.node_name.downcase == 'a' && 
                                           @studied_page.at_xpath("//a[text()='#{studied_node.inner_text}']") == studied_node)

  # Finally, try a CSS path.  Make that a simple xpath, since nth-of-type doesn't work.  But give up if we were told not to convert to CSS.
  return locator unless tocss
  return "xpath=#{studied_node.path}"
end
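
Before any simplification is attempted, the method sorts the locator into one of three kinds by its prefix. The sketch below isolates that step in a hypothetical helper, split_locator (not part of Rsel), which returns the kind and the expression that would be handed to at_css or at_xpath.

```ruby
# Hypothetical helper mirroring the prefix checks at the top of #simplify_locator.
def split_locator(locator)
  if locator[0, 4] == 'css='
    [:css, locator[4, locator.length]]
  elsif locator[0, 6] == 'xpath=' || locator[0, 2] == '//'
    # A bare // expression is treated as if it had the xpath= prefix.
    locator = 'xpath=' + locator if locator[0, 2] == '//'
    [:xpath, locator[6, locator.length]]
  else
    # Any other locator type is returned unchanged by #simplify_locator.
    [:other, locator]
  end
end

p split_locator('css=div#main a') # [:css, "div#main a"]
p split_locator('//a[@href]')     # [:xpath, "//a[@href]"]
p split_locator('id=login')       # [:other, "id=login"]
```

Only the :css and :xpath kinds go on to the id, name, link and positional-path attempts; everything else is passed through as given.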

#study(page, keep = false) ⇒ `Object`

Load a page to study.

Parameters:

page (String) —
Any argument that works for Nokogiri::HTML. Often HTML in a string, or an open IO object such as a File.
keep (Boolean) (defaults to: false) —
Sets #keep_clean with this argument. Default is false, so study(), by default, turns off keep_clean.

# File 'lib/rsel/study_html.rb', line 33

def study(page, keep=false)
  @sections_kept_clean = []
  begin
    @studied_page = Nokogiri::HTML(page)
    @dirties = 0
  rescue => e
    @keep_clean = false
    @dirties = NO_PAGE_LOADED
    @studied_page = nil
    raise e
  end
  @keep_clean = keep
end

#undo_all_dirties ⇒ `Object`

Try to un-#dirty the studied page. Returns true on success or false if there was no page to clean.

# File 'lib/rsel/study_html.rb', line 61

def undo_all_dirties
  if @studied_page != nil
    @dirties = 0
  else
    @keep_clean = false
    @dirties = NO_PAGE_LOADED
  end
  return @dirties == 0
end

#undo_last_dirty ⇒ `Object`

Try to un-#dirty the studied page by marking one (potential) change since the page was studied as not actually a change. This may or may not be enough to resume using the studied page. Cannot be used preemptively: the dirty count never drops below zero, so an undo issued before the matching #dirty has no effect.




# File 'lib/rsel/study_html.rb', line 56

def undo_last_dirty
  @dirties -= 1 unless @dirties <= 0
end
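
The counter semantics, including why an undo cannot be issued in advance, can be seen in a small sketch. DirtyCounter is a hypothetical stand-in (not part of Rsel) that copies only #dirty, #undo_last_dirty and #clean?, and assumes a page has just been studied.

```ruby
# Hypothetical stand-in (not part of Rsel): only the dirty counter is modelled.
class DirtyCounter
  def initialize
    @dirties = 0 # assumption: a page has just been studied
    @keep_clean = false
  end

  def dirty
    @dirties += 1 unless @keep_clean
  end

  def undo_last_dirty
    @dirties -= 1 unless @dirties <= 0
  end

  def clean?
    @dirties == 0
  end
end

counter = DirtyCounter.new
counter.undo_last_dirty # no effect: there is nothing to undo yet
counter.dirty
puts counter.clean?     # false: the earlier undo was not banked
counter.undo_last_dirty
puts counter.clean?     # true
```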
