Class: Ariel::CandidateSelector
- Inherits: Object
  - Object
  - Ariel::CandidateSelector
- Defined in: lib/ariel/candidate_selector.rb
Overview
Given an array of candidate Rules and an array of LabeledStreams, allows heuristics to be applied to select the ideal Rule. Each select_* instance method narrows the internal candidates array, keeping only the candidates that score highest under that heuristic.
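The narrowing behaviour described above can be sketched without the library. This is a minimal illustration, not Ariel's code: SketchRule, keep_best, and the numeric fields are hypothetical stand-ins for real Rule objects and the select_* methods.

```ruby
# Hypothetical stand-in for a rule; only the fields this sketch scores on.
SketchRule = Struct.new(:name, :match_count, :wildcards)

# Keep every candidate tied for the highest score under the given block.
def keep_best(candidates)
  best = candidates.map { |c| yield(c) }.max
  candidates.select { |c| yield(c) == best }
end

candidates = [
  SketchRule.new("a", 3, 2),
  SketchRule.new("b", 3, 0),
  SketchRule.new("c", 1, 0)
]

# Apply heuristics in order; each pass can only shrink the list.
candidates = keep_best(candidates) { |r| r.match_count }  # "a" and "b" tie
candidates = keep_best(candidates) { |r| -r.wildcards }   # "b" has fewer wildcards
chosen = candidates.sample                                # final tie-break
```

Ordering the heuristics from most to least important means later passes only break ties left by earlier ones.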
Instance Attribute Summary
- #candidates ⇒ Object
  Returns the value of attribute candidates.
Instance Method Summary
- #highest_scoring_by(&scorer) ⇒ Object
  Takes a scoring function as a block, and yields each rule to it.
- #initialize(candidates, examples) ⇒ CandidateSelector (constructor)
  A new instance of CandidateSelector.
- #random_from_remaining ⇒ Object
  Returns a random candidate.
- #score_by ⇒ Object
  All scoring functions use this indirectly.
- #select_best_by_match_type(*match_types) ⇒ Object
  Selects the Rule candidates that have the most matches of a given type against the given examples.
- #select_closest_to_label ⇒ Object
- #select_with_fewer_wildcards ⇒ Object
- #select_with_longer_end_landmarks ⇒ Object
Constructor Details
#initialize(candidates, examples) ⇒ CandidateSelector
Returns a new instance of CandidateSelector.
    # File 'lib/ariel/candidate_selector.rb', line 9
    def initialize(candidates, examples)
      @candidates=candidates.dup #Just in case a CandidateSelector function directly modifies the array, affecting the original. Shouldn't happen.
      @examples=examples
    end
Instance Attribute Details
#candidates ⇒ Object
Returns the value of attribute candidates.
    # File 'lib/ariel/candidate_selector.rb', line 8
    def candidates
      @candidates
    end
Instance Method Details
#highest_scoring_by(&scorer) ⇒ Object
Takes a scoring function as a block, and yields each rule to it. Returns an array of the Rule candidates that have the highest score.
    # File 'lib/ariel/candidate_selector.rb', line 45
    def highest_scoring_by(&scorer)
      score_hash = score_by &scorer
      best_score = score_hash.values.sort.last
      highest_scorers=[]
      score_hash.each do |candidate_index, score|
        highest_scorers << @candidates[candidate_index] if score==best_score
      end
      debug "#{highest_scorers.size} highest_scorers were found, with a score of #{best_score}"
      return highest_scorers
    end
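The key property of this method is that ties are kept: every candidate sharing the top score is returned, not just the first. A minimal free-standing sketch of that behaviour follows; highest_scoring_sketch is a hypothetical name, and plain strings stand in for Rule objects.

```ruby
# Hypothetical free-standing version: returns all candidates that tie
# for the highest score produced by the block.
def highest_scoring_sketch(candidates)
  scores = candidates.map { |rule| yield(rule) }
  best = scores.max
  candidates.zip(scores).select { |_, score| score == best }.map(&:first)
end

# Strings stand in for rules; the score is the string length.
winners = highest_scoring_sketch(%w[ab c de f]) { |word| word.length }
# "ab" and "de" both score 2, so both are returned, in original order.
```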
#random_from_remaining ⇒ Object
Returns a random candidate. Meant for making the final choice in case previous selections have still left multiple candidates.
    # File 'lib/ariel/candidate_selector.rb', line 89
    def random_from_remaining
      debug "Selecting random from last #{candidates.size} candidate rules"
      @candidates.sort_by {rand}.first
    end
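The shuffle-then-take-first idiom used here picks one element uniformly at random. In current Ruby, Array#sample expresses the same intent more directly; the sketch below shows both side by side on placeholder strings rather than real rules.

```ruby
remaining = %w[rule_a rule_b rule_c]

# The idiom from the listing: sort by a random key, take the first element.
shuffled_pick = remaining.sort_by { rand }.first

# Array#sample picks one element at random without sorting the array.
sampled_pick = remaining.sample

# Both return nil for an empty array rather than raising.
empty_pick = [].sample
```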
#score_by ⇒ Object
All scoring functions use this indirectly. It iterates over each Rule candidate, and assigns it a score in a hash of index:score pairs. Each rule is yielded to the given block, which is expected to return that rule’s score.
    # File 'lib/ariel/candidate_selector.rb', line 35
    def score_by
      score_hash={}
      @candidates.each_with_index do |rule, index|
        score_hash[index]= yield rule
      end
      return score_hash
    end
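To make the shape of the returned hash concrete, here is a small sketch that mirrors the listing outside the class. score_by_sketch is a hypothetical name, and strings stand in for Rule objects.

```ruby
# Hypothetical free-standing version of score_by: maps each candidate's
# array index to the score the block returns for that candidate.
def score_by_sketch(candidates)
  score_hash = {}
  candidates.each_with_index do |rule, index|
    score_hash[index] = yield(rule)
  end
  score_hash
end

scores = score_by_sketch(%w[aa b cccc]) { |rule| rule.length }
# Index 0 scores 2, index 1 scores 1, index 2 scores 4.
```

Keying by index rather than by the rule itself means two equal rules still receive separate entries.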
#select_best_by_match_type(*match_types) ⇒ Object
Selects the Rule candidates that have the most matches of a given type against the given examples. e.g. select_best_by_match_type(:early, :perfect) will select the rules that have the most matches that are early or perfect.
    # File 'lib/ariel/candidate_selector.rb', line 18
    def select_best_by_match_type(*match_types)
      debug "Selecting best by match types #{match_types}"
      return @candidates if @candidates.size==1
      @candidates = highest_scoring_by do |rule|
        rule_score=0
        @examples.each do |example|
          rule_score+=1 if rule.matches(example, *match_types)
        end
        rule_score #why doesn't return rule_score raise an error?
      end
      return @candidates
    end
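The score computed in the block is simply the number of examples for which the rule produces one of the requested match types. The sketch below reproduces that count with a hypothetical StubRule; the :late and :fail symbols and the string examples are illustrative assumptions, not Ariel's actual match types or LabeledStream objects.

```ruby
# Hypothetical stand-in for a rule: it records a fixed match type for
# each example and answers matches() the way the listing calls it.
class StubRule
  def initialize(outcomes)
    @outcomes = outcomes
  end

  # True if this rule's outcome for the example is one of match_types.
  def matches(example, *match_types)
    match_types.include?(@outcomes[example])
  end
end

examples = %w[ex1 ex2 ex3]
rule_a = StubRule.new({ "ex1" => :perfect, "ex2" => :early, "ex3" => :late })
rule_b = StubRule.new({ "ex1" => :perfect, "ex2" => :fail, "ex3" => :fail })

# Count the examples matched as :early or :perfect, as the block does.
count = ->(rule) { examples.count { |ex| rule.matches(ex, :early, :perfect) } }
score_a = count.call(rule_a)  # two examples qualify
score_b = count.call(rule_b)  # one example qualifies
```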
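#select_closest_to_label ⇒ Object
Selects the Rule candidates whose matches land closest, on average, to each example's label_index. Each rule is scored by the negated mean absolute distance between its match index and the label index, taken over the examples it matches; examples it fails to match are skipped. A rule that matches no examples scores 0, which ties with a rule that always lands exactly on the label.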
    # File 'lib/ariel/candidate_selector.rb', line 62
    def select_closest_to_label
      debug "Selecting rules that match the examples closest to the label"
      @candidates = highest_scoring_by do |rule|
        rule_score=0
        matched_examples=0
        @examples.each do |example|
          match_index = rule.apply_to(example)
          if match_index.nil?
            next
          else
            rule_score+= (example.label_index - match_index).abs
            matched_examples+=1
          end
        end
        rule_score = rule_score.to_f/matched_examples unless matched_examples==0 #mean distance from label_index
        -rule_score #So highest scoring = closest to label index.
      end
      return @candidates
    end
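The scoring arithmetic in this listing can be worked through on plain numbers. In this sketch, closeness_score is a hypothetical helper that takes the match indexes a rule produced (nil where it did not match) and the corresponding label indexes, and follows the same steps as the block above.

```ruby
# Hypothetical helper: negated mean absolute distance between match and
# label indexes, skipping examples where the rule did not match (nil).
def closeness_score(match_indexes, label_indexes)
  distances = match_indexes.zip(label_indexes)
                           .reject { |match, _label| match.nil? }
                           .map { |match, label| (label - match).abs }
  return 0 if distances.empty?  # mirrors the listing: the score stays 0
  -(distances.sum.to_f / distances.size)
end

# Distances are 1 and 2 (the third example is unmatched), so the mean is 1.5.
partial = closeness_score([4, 7, nil], [5, 5, 9])

# A rule that matches nothing keeps a score of 0.
unmatched = closeness_score([nil, nil], [5, 9])
```

Because every matched rule scores at or below 0, the unmatched rule here outranks the partially matching one, which is worth keeping in mind when ordering the heuristics.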
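#select_with_fewer_wildcards ⇒ Object
Selects the Rule candidates with the fewest wildcards, by scoring each rule as the negative of its wildcard_count.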
    # File 'lib/ariel/candidate_selector.rb', line 56
    def select_with_fewer_wildcards
      debug "Selecting the rules with the fewest wildcards"
      @candidates = highest_scoring_by {|rule| -rule.wildcard_count} #hack or not?
      return @candidates
    end
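Negating the count turns a "pick the minimum" problem into the "pick the maximum" problem that highest_scoring_by already solves. A small sketch of the equivalence, using a hypothetical hash of rule names to wildcard counts:

```ruby
# Hypothetical data: rule name => number of wildcards in that rule.
wildcard_counts = { "rule_a" => 2, "rule_b" => 0, "rule_c" => 1 }

# Negate each count so that the fewest wildcards gives the highest score.
negated = wildcard_counts.transform_values { |count| -count }
best = negated.values.max
fewest_by_negation = negated.select { |_name, score| score == best }.keys

# The same selection expressed directly as a minimum.
minimum = wildcard_counts.values.min
fewest_by_minimum = wildcard_counts.select { |_name, count| count == minimum }.keys
```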
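#select_with_longer_end_landmarks ⇒ Object
Selects the Rule candidates whose final landmark is longest, scoring each rule by the size of its last landmark. A rule with no landmarks produces a nil score, which highest_scoring_by does not handle specially.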
    # File 'lib/ariel/candidate_selector.rb', line 82
    def select_with_longer_end_landmarks
      debug "Selecting rules that have longer end landmarks"
      @candidates = highest_scoring_by {|rule| rule.landmarks.last.size unless rule.landmarks.last.nil?}
    end
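The block in this listing evaluates to nil when a rule has no last landmark, and highest_scoring_by sorts the scores it collects. The sketch below shows what Ruby does when nil is sorted alongside integers, and one possible nil-safe scoring. Representing landmarks as nested arrays of strings is an assumption made for illustration only.

```ruby
# Sorting integers mixed with nil raises, because 3 <=> nil is nil.
mixed_scores = [3, nil, 5]
begin
  mixed_scores.sort.last
  sort_raised = false
rescue ArgumentError
  sort_raised = true
end

# Hypothetical landmark lists for two rules: one with two landmarks,
# one with none at all.
landmark_lists = [[%w[a b], %w[c d e]], []]

# One nil-safe alternative: treat a missing end landmark as length 0.
safe_scores = landmark_lists.map do |landmarks|
  landmarks.last ? landmarks.last.size : 0
end
```

Scoring a missing landmark as 0 keeps every score comparable while still ranking such rules last.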