Class: Ariel::Token
- Inherits: Object
- Inheritance chain: Object, Ariel::Token
- Defined in: lib/ariel/token.rb
Overview
Tokens populate a TokenStream. They know their position in the original document, can list the wildcards that match them and determine whether a given string or wildcard is a valid match. During the process of parsing a labeled document, some tokens may be marked as being a label_tag. These are filtered from the TokenStream before the rule learning phase.
Instance Attribute Summary
- #end_loc ⇒ Object (readonly)
  Returns the value of attribute end_loc.
- #start_loc ⇒ Object (readonly)
  Returns the value of attribute start_loc.
- #text ⇒ Object (readonly)
  Returns the value of attribute text.
Instance Method Summary
- #<=>(t) ⇒ Object
  Tokens are sorted based on their start_loc.
- #==(t) ⇒ Object
  Tokens are only equal if they have an equal start_loc, end_loc and text.
- #initialize(text, start_loc, end_loc, label_tag = false) ⇒ Token (constructor)
  Each new Token must have a string representing its content, its start position in the original document (start_loc) and the point at which it ends (end_loc).
- #is_label_tag? ⇒ Boolean
  Returns true or false depending on whether the token was marked as a label tag when it was initialized.
- #matches?(landmark) ⇒ Boolean
  Accepts a string, a symbol representing a wildcard in Wildcards#list, or an arbitrary regex.
- #matching_wildcards ⇒ Object
  Returns an array of symbols corresponding to the Wildcards that match the Token.
Constructor Details
#initialize(text, start_loc, end_loc, label_tag = false) ⇒ Token
Each new Token must have a string representing its content, its start position in the original document (start_loc) and the point at which it ends (end_loc). For instance, in str=“This is an example”, if “is” were to be made a Token it would be given a start_loc of 5 and an end_loc of 7: the token occupies indices 5 and 6 of str, and end_loc points just past its last character.
    # File 'lib/ariel/token.rb', line 16
    def initialize(text, start_loc, end_loc, label_tag=false)
      @text=text.to_s
      @start_loc=start_loc
      @end_loc=end_loc
      @label_tag=label_tag
    end
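The offsets in the example above can be checked in plain Ruby. This is a minimal sketch, assuming end_loc is exclusive (as the values 5 and 7 imply). `SketchToken` is a hypothetical stand-in that copies the constructor shown above so the snippet runs without the Ariel library.

```ruby
# Hypothetical stand-in mirroring the constructor in the listing above.
class SketchToken
  attr_reader :text, :start_loc, :end_loc

  def initialize(text, start_loc, end_loc, label_tag = false)
    @text = text.to_s
    @start_loc = start_loc
    @end_loc = end_loc
    @label_tag = label_tag
  end
end

str = "This is an example"
token = SketchToken.new("is", 5, 7)

# With an exclusive end_loc, the token's text can be recovered from the document.
puts str[token.start_loc...token.end_loc] == token.text
```

Because the constructor calls to_s on its first argument, a non-string such as a symbol is stored as its string form.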
Instance Attribute Details
#end_loc ⇒ Object (readonly)
Returns the value of attribute end_loc.
    # File 'lib/ariel/token.rb', line 9
    def end_loc
      @end_loc
    end
#start_loc ⇒ Object (readonly)
Returns the value of attribute start_loc.
    # File 'lib/ariel/token.rb', line 9
    def start_loc
      @start_loc
    end
#text ⇒ Object (readonly)
Returns the value of attribute text.
    # File 'lib/ariel/token.rb', line 9
    def text
      @text
    end
Instance Method Details
#<=>(t) ⇒ Object
Tokens are sorted based on their start_loc.
    # File 'lib/ariel/token.rb', line 35
    def <=>(t)
      @start_loc <=> t.start_loc
    end
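Since Array#sort relies on <=>, tokens collected out of order can be put back into document order. The sketch below uses a hypothetical `SketchToken` stand-in that copies only the comparison from the listing above. The offsets are computed by hand for the string "This is an example".

```ruby
# Hypothetical stand-in with the same <=> as the listing above.
class SketchToken
  attr_reader :text, :start_loc, :end_loc

  def initialize(text, start_loc, end_loc)
    @text = text.to_s
    @start_loc = start_loc
    @end_loc = end_loc
  end

  # Order tokens by where they begin in the original document.
  def <=>(other)
    @start_loc <=> other.start_loc
  end
end

tokens = [
  SketchToken.new("an", 8, 10),
  SketchToken.new("This", 0, 4),
  SketchToken.new("is", 5, 7)
]

# Array#sort calls <=> on the elements.
puts tokens.sort.map(&:text).join(" ")
```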
#==(t) ⇒ Object
Tokens are only equal if they have an equal start_loc, end_loc and text.
    # File 'lib/ariel/token.rb', line 30
    def ==(t)
      return (@start_loc==t.start_loc && @end_loc==t.end_loc && @text==t.text)
    end
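One consequence of the comparison above is that the label_tag flag plays no part in equality, while two tokens with the same text at different positions are distinct. A minimal sketch, using a hypothetical `SketchToken` stand-in that copies the == logic from the listing:

```ruby
# Hypothetical stand-in with the same == as the listing above.
class SketchToken
  attr_reader :text, :start_loc, :end_loc

  def initialize(text, start_loc, end_loc, label_tag = false)
    @text = text.to_s
    @start_loc = start_loc
    @end_loc = end_loc
    @label_tag = label_tag
  end

  # Equal only when position and text all agree; label_tag is not compared.
  def ==(other)
    @start_loc == other.start_loc && @end_loc == other.end_loc && @text == other.text
  end
end

a = SketchToken.new("is", 5, 7)
b = SketchToken.new("is", 5, 7, true) # differs only in the label_tag flag
c = SketchToken.new("is", 2, 4)       # same text, different position

puts a == b
puts a == c
```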
#is_label_tag? ⇒ Boolean
Returns true or false depending on whether the token was marked as a label tag when it was initialized.
    # File 'lib/ariel/token.rb', line 25
    def is_label_tag?
      @label_tag
    end
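The overview notes that label-tag tokens are filtered out before rule learning. How TokenStream does this is not shown here, so the following is only a sketch of the idea. It uses a plain Array in place of a TokenStream, a hypothetical `SketchToken` stand-in, and made-up tag text.

```ruby
# Hypothetical stand-in exposing the label-tag flag as in the listing above.
class SketchToken
  attr_reader :text, :start_loc, :end_loc

  def initialize(text, start_loc, end_loc, label_tag = false)
    @text = text.to_s
    @start_loc = start_loc
    @end_loc = end_loc
    @label_tag = label_tag
  end

  def is_label_tag?
    @label_tag
  end
end

# The tag syntax here is invented for illustration.
tokens = [
  SketchToken.new("<l:title>", 0, 9, true),
  SketchToken.new("Ariel", 9, 14),
  SketchToken.new("</l:title>", 14, 24, true)
]

# Keep only the tokens that carry document content.
content = tokens.reject(&:is_label_tag?)
puts content.map(&:text).inspect
```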
#matches?(landmark) ⇒ Boolean
Accepts a string, a symbol representing a wildcard in Wildcards#list, or an arbitrary regex. Returns true if the whole Token is consumed by the wildcard or regex, or if the string is equal to Token#text, and false if the match fails. Raises an ArgumentError if the passed symbol is not a member of Wildcards#list.
    # File 'lib/ariel/token.rb', line 44
    def matches?(landmark)
      if landmark.kind_of? Symbol or landmark.kind_of? Regexp
        if landmark.kind_of? Symbol
          raise ArgumentError, "#{landmark} is not a valid wildcard." unless Wildcards.list.has_key? landmark
          regex = Wildcards.list[landmark]
        else
          regex = landmark
        end
        if self.text[regex] == self.text
          return true
        else
          return false
        end
      else
        return true if landmark==self.text
      end
      return false
    end
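The three kinds of landmark behave as sketched below. The listing only shows that Wildcards.list is keyed by symbol and yields a regex, so `SketchWildcards` and its two entries are invented for illustration. `SketchToken` is a hypothetical stand-in whose matches? condenses the logic of the listing above.

```ruby
# Hypothetical wildcard table; the real entries of Wildcards.list are not shown here.
module SketchWildcards
  def self.list
    { numeric: /\d+/, alpha: /[[:alpha:]]+/ }
  end
end

# Hypothetical stand-in; matches? condenses the logic of the listing above.
class SketchToken
  attr_reader :text

  def initialize(text, start_loc, end_loc)
    @text = text.to_s
    @start_loc = start_loc
    @end_loc = end_loc
  end

  def matches?(landmark)
    case landmark
    when Symbol
      unless SketchWildcards.list.key?(landmark)
        raise ArgumentError, "#{landmark} is not a valid wildcard."
      end
      text[SketchWildcards.list[landmark]] == text
    when Regexp
      # String#[] returns the first match; it must equal the whole text.
      text[landmark] == text
    else
      landmark == text
    end
  end
end

token = SketchToken.new("2024", 0, 4)
puts token.matches?("2024")   # string compared against the token's text
puts token.matches?(:numeric) # wildcard regex consumes the whole token
puts token.matches?(/\d{2}/)  # regex matches only part of the token

begin
  token.matches?(:no_such_wildcard)
rescue ArgumentError => e
  puts e.message
end
```

The partial-match case is the one to watch: a regex that matches only a prefix of the token counts as a failed match, because the matched substring is compared with the full text.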
#matching_wildcards ⇒ Object
Returns an array of symbols corresponding to the Wildcards that match the Token.
    # File 'lib/ariel/token.rb', line 65
    def matching_wildcards
      return Wildcards.matching(self.text)
    end
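The method simply delegates to Wildcards.matching, whose implementation is not shown on this page. The sketch below assumes it returns the names of every wildcard whose regex consumes the whole string, consistent with how matches? treats wildcards. `SketchWildcards`, its entries, and `SketchToken` are all hypothetical.

```ruby
# Hypothetical wildcard table and lookup; an assumption about how
# Wildcards.matching might behave, not the library's code.
module SketchWildcards
  LIST = { numeric: /\d+/, alpha: /[[:alpha:]]+/, anything: /.+/ }.freeze

  # Names of all wildcards whose regex consumes the entire string.
  def self.matching(string)
    LIST.select { |_name, regex| string[regex] == string }.keys
  end
end

# Hypothetical stand-in that delegates as in the listing above.
class SketchToken
  attr_reader :text

  def initialize(text, start_loc, end_loc)
    @text = text.to_s
    @start_loc = start_loc
    @end_loc = end_loc
  end

  def matching_wildcards
    SketchWildcards.matching(text)
  end
end

puts SketchToken.new("2024", 0, 4).matching_wildcards.inspect
puts SketchToken.new("Ariel", 0, 5).matching_wildcards.inspect
```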