Class: Ariel::CandidateSelector

Inherits:
Object
Defined in:
lib/ariel/candidate_selector.rb

Overview

Given an array of candidate Rules and an array of LabeledStreams, applies heuristics to select the best Rule. Each select_* instance method narrows the internal candidates array to the rules that score highest under that heuristic.
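
The narrowing workflow can be sketched as follows. This is a simplified stand-in rather than Ariel's own code: `MiniSelector` and `StubRule` are hypothetical names, only the wildcard heuristic is modelled, and `Array#sample` is used for the final pick.

```ruby
# Hypothetical stand-in for an Ariel rule; only the attribute used below is modelled.
StubRule = Struct.new(:name, :wildcard_count)

# Simplified sketch of the selector: each select_* call narrows @candidates.
class MiniSelector
  attr_reader :candidates

  def initialize(candidates)
    @candidates = candidates.dup
  end

  # Keep only the candidates whose block score equals the maximum score.
  def highest_scoring_by
    scores = @candidates.map { |rule| yield rule }
    best = scores.max
    @candidates.select.with_index { |_rule, i| scores[i] == best }
  end

  def select_with_fewer_wildcards
    @candidates = highest_scoring_by { |rule| -rule.wildcard_count }
  end

  # Final tie-break once the heuristics have run.
  def random_from_remaining
    @candidates.sample
  end
end

rules = [StubRule.new("a", 2), StubRule.new("b", 0), StubRule.new("c", 0)]
selector = MiniSelector.new(rules)
selector.select_with_fewer_wildcards
remaining = selector.candidates.map(&:name) # "b" and "c" tie on zero wildcards
chosen = selector.random_from_remaining     # one of the tied rules
```

Heuristics can be chained in order of priority, with the random pick reserved for ties that no heuristic breaks.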

Constructor Details

#initialize(candidates, examples) ⇒ CandidateSelector

Returns a new instance of CandidateSelector.



# File 'lib/ariel/candidate_selector.rb', line 9

def initialize(candidates, examples)
  @candidates=candidates.dup #Just in case a CandidateSelector function directly modifies the array, affecting the original. Shouldn't happen.
  @examples=examples
end
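
The effect of `dup` in the constructor can be illustrated in isolation. The sketch below assumes nothing about Ariel beyond the call to `dup`; `StubRule` is a hypothetical stand-in for a Rule. The copy is shallow: removing elements from it leaves the caller's array intact, but the rule objects themselves are shared.

```ruby
StubRule = Struct.new(:name) # hypothetical stand-in for a Rule

original = [StubRule.new("a"), StubRule.new("b")]
copy = original.dup

copy.pop                    # removes "b" from the copy only
copy.first.name = "renamed" # same object, so the change is visible via original too
```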

Instance Attribute Details

#candidates ⇒ Object

Returns the value of attribute candidates.



# File 'lib/ariel/candidate_selector.rb', line 8

def candidates
  @candidates
end

Instance Method Details

#highest_scoring_by(&scorer) ⇒ Object

Takes a scoring function as a block, and yields each rule to it. Returns an array of the Rule candidates that have the highest score.



# File 'lib/ariel/candidate_selector.rb', line 45

def highest_scoring_by(&scorer)
  score_hash = score_by(&scorer)
  best_score = score_hash.values.max
  highest_scorers=[]
  score_hash.each do |candidate_index, score|
    highest_scorers << @candidates[candidate_index] if score==best_score
  end
  debug "#{highest_scorers.size} highest_scorers were found, with a score of #{best_score}"
  return highest_scorers
end
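
A standalone sketch of the same idea shows that every candidate tied for the top score is returned, not just the first one. `highest_scoring` is a hypothetical free function that takes the items as an argument instead of reading `@candidates`, and strings stand in for rules.

```ruby
# Returns every item whose score equals the maximum score.
def highest_scoring(items)
  scores = {}
  items.each_with_index { |item, index| scores[index] = yield(item) }
  best = scores.values.max
  scores.select { |_index, score| score == best }.keys.map { |index| items[index] }
end

words = %w[pear fig plum kiwi]
longest = highest_scoring(words) { |w| w.length } # three words tie on length 4
```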

#random_from_remaining ⇒ Object

Returns a random candidate. Meant for making the final choice in case previous selections have still left multiple candidates.



# File 'lib/ariel/candidate_selector.rb', line 89

def random_from_remaining
  debug "Selecting random from last #{candidates.size} candidate rules"
  @candidates.sort_by {rand}.first
end
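
The shuffle-and-take-first idiom can be compared with `Array#sample`, available since Ruby 1.9. This is only an illustration with strings standing in for rules, not a statement about which Ruby versions Ariel targets. Both forms leave the array unchanged and return nil when it is empty.

```ruby
remaining = %w[rule_a rule_b rule_c]

shuffled_pick = remaining.sort_by { rand }.first # idiom used in the method above
sampled_pick = remaining.sample                  # equivalent uniform pick

empty_pick = [].sample                           # nil when nothing is left
```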

#score_by ⇒ Object

All scoring functions use this indirectly. It iterates over each Rule candidate, and assigns it a score in a hash of index:score pairs. Each rule is yielded to the given block, which is expected to return that rule’s score.



# File 'lib/ariel/candidate_selector.rb', line 35

def score_by
  score_hash={}
  @candidates.each_with_index do |rule, index|
    score_hash[index]= yield rule
  end
  return score_hash
end
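
The shape of the returned hash is easiest to see on plain data. In this sketch, `score_by` is rewritten as a hypothetical free function taking the candidates as an argument, and strings stand in for rules. One consequence of keying by index is that candidates which compare equal still get separate entries.

```ruby
def score_by(candidates)
  score_hash = {}
  candidates.each_with_index { |rule, index| score_hash[index] = yield rule }
  score_hash
end

# The first and last candidates are equal strings but keep distinct keys 0 and 2.
scores = score_by(%w[ab abcd ab]) { |s| s.length }
```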

#select_best_by_match_type(*match_types) ⇒ Object

Selects the Rule candidates that have the most matches of the given types against the examples. For example, select_best_by_match_type(:early, :perfect) selects the rules with the most matches that are either early or perfect.



# File 'lib/ariel/candidate_selector.rb', line 18

def select_best_by_match_type(*match_types)
  debug "Selecting best by match types #{match_types}"
  return @candidates if @candidates.size==1
  @candidates = highest_scoring_by do |rule|
    rule_score=0
    @examples.each do |example|
      rule_score+=1 if rule.matches(example, *match_types)
    end
    rule_score #the block's last expression is its value; an explicit return here would exit select_best_by_match_type itself
  end
  return @candidates
end
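
The counting step can be sketched with stub objects. Everything here is an assumption for illustration: `StubRule`, its `outcomes` table, and the `:late` and `:fail` match types are hypothetical, and `matches` is assumed to return true when the rule's match on an example is of one of the requested types.

```ruby
# Hypothetical stand-in: outcomes maps each example to the kind of match the rule produces.
StubRule = Struct.new(:name, :outcomes) do
  def matches(example, *match_types)
    match_types.include?(outcomes[example])
  end
end

examples = [:ex1, :ex2, :ex3]
rules = [
  StubRule.new("r1", { ex1: :perfect, ex2: :early, ex3: :fail }),
  StubRule.new("r2", { ex1: :perfect, ex2: :late, ex3: :fail }),
  StubRule.new("r3", { ex1: :early, ex2: :early, ex3: :late })
]

# One point per example on which the rule produces any of the requested match types.
def count_matches(rule, examples, *match_types)
  examples.count { |example| rule.matches(example, *match_types) }
end

scores = rules.map { |rule| count_matches(rule, examples, :early, :perfect) }
best = scores.max
winners = rules.select.with_index { |_rule, i| scores[i] == best }.map(&:name)
```

Here r1 and r3 each score 2 and both survive, while r2 scores 1 and is dropped.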

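#select_closest_to_label ⇒ Object

Selects the Rule candidates whose match positions lie closest, on average, to each example's label_index. Examples that a rule fails to match are ignored when computing its mean distance.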



# File 'lib/ariel/candidate_selector.rb', line 62

def select_closest_to_label
  debug "Selecting rules that match the examples closest to the label"
  @candidates = highest_scoring_by do |rule|
    rule_score=0
    matched_examples=0
    @examples.each do |example|
      match_index = rule.apply_to(example)
      if match_index.nil?
        next
      else
        rule_score+= (example.label_index - match_index).abs
        matched_examples+=1
      end
    end
    rule_score = rule_score.to_f/matched_examples unless matched_examples==0 #mean distance from label_index
    -rule_score #So highest scoring = closest to label index.
  end
  return @candidates
end
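
The scoring arithmetic can be worked through on plain integers. In this sketch `closeness_score` is a hypothetical helper, label and match positions are given as parallel arrays, and nil marks an example the rule did not match. Following the code above, a rule that matches no example keeps a score of 0, the same score as a rule that always lands exactly on the label.

```ruby
# Negated mean absolute distance between label positions and match positions.
def closeness_score(label_indexes, match_indexes)
  total = 0
  matched = 0
  label_indexes.zip(match_indexes).each do |label_index, match_index|
    next if match_index.nil? # unmatched examples do not count towards the mean
    total += (label_index - match_index).abs
    matched += 1
  end
  mean = matched.zero? ? 0 : total.to_f / matched
  -mean # negate so that the highest score is the smallest distance
end

labels = [10, 20, 30]
near = closeness_score(labels, [9, 20, nil])    # distances 1 and 0, mean 0.5
far = closeness_score(labels, [4, 26, 30])      # distances 6, 6 and 0, mean 4.0
none = closeness_score(labels, [nil, nil, nil]) # no matches, score stays 0
```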

#select_with_fewer_wildcards ⇒ Object

Selects the Rule candidates that contain the fewest wildcards.



# File 'lib/ariel/candidate_selector.rb', line 56

def select_with_fewer_wildcards
  debug "Selecting the rules with the fewest wildcards"
  @candidates = highest_scoring_by {|rule| -rule.wildcard_count} #hack or not?
  return @candidates
end

#select_with_longer_end_landmarks ⇒ Object

Selects the Rule candidates whose final landmark is longest.



# File 'lib/ariel/candidate_selector.rb', line 82

def select_with_longer_end_landmarks
  debug "Selecting rules that have longer end landmarks"
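  @candidates = highest_scoring_by {|rule| rule.landmarks.last.nil? ? 0 : rule.landmarks.last.size} #a rule with no landmarks scores 0, since a nil score cannot be ranked against integers
  return @candidates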
end
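
Scores have to be mutually comparable for a maximum to be found, so a rule without landmarks needs a numeric score. The sketch below assumes that landmarks is an array of token arrays; `StubRule` and `end_landmark_length` are hypothetical names, and a missing end landmark is scored as length 0.

```ruby
# Hypothetical stand-in: landmarks is an array of token arrays, possibly empty.
StubRule = Struct.new(:name, :landmarks)

def end_landmark_length(rule)
  last = rule.landmarks.last
  last.nil? ? 0 : last.size
end

rules = [
  StubRule.new("short", [["<b>"]]),
  StubRule.new("long", [["<td>"], ["Price", ":"]]),
  StubRule.new("empty", [])
]
lengths = rules.map { |rule| end_landmark_length(rule) }
longest = rules.max_by { |rule| end_landmark_length(rule) }.name

# Mixing nil with integers cannot be ranked: sorting raises ArgumentError.
mixed_scores_sortable =
  begin
    [1, nil, 2].sort
    true
  rescue ArgumentError
    false
  end
```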