Class: Ariel::StructureNode
- Inherits:
- Object
- Ariel::StructureNode
- Includes:
- NodeLike
- Defined in:
- lib/ariel/structure_node.rb
Overview
Implements a node object used to represent the structure of the document tree. Each node stores the start and end rules needed to extract the desired content from its parent node, so it can be viewed as a rule-storing object.
Instance Attribute Summary
- #ruleset ⇒ Object
Returns the value of attribute ruleset.
Attributes included from NodeLike
Instance Method Summary
- #apply_extraction_tree_on(root_node, extract_labels = false) ⇒ Object
Applies the extraction rules stored in the current StructureNode and all its descendant children.
- #extend_structure {|_self| ... } ⇒ Object
Used to extend an already created Node.
- #extract_from(node) ⇒ Object
Given a Node to apply its rules to, this method creates a new node and adds it as a child of the given node.
- #initialize(name = :root, type = :not_list) {|_self| ... } ⇒ StructureNode (constructor)
A new instance of StructureNode.
- #item(name, &block) ⇒ Object
- #list_item(name, &block) ⇒ Object
- #method_missing(method, *args, &block) ⇒ Object
Methods included from NodeLike
Constructor Details
#initialize(name = :root, type = :not_list) {|_self| ... } ⇒ StructureNode
Returns a new instance of StructureNode.
# File 'lib/ariel/structure_node.rb', line 10
def initialize(name=:root, type=:not_list, &block)
  @children={}
  @meta = OpenStruct.new({:name=>name, :node_type=>type})
  yield self if block_given?
end
Dynamic Method Handling
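This class handles dynamic methods through the method_missing method. A call whose name matches the key of a child node returns that child; any other call is passed on to super.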
#method_missing(method, *args, &block) ⇒ Object
# File 'lib/ariel/structure_node.rb', line 66
def method_missing(method, *args, &block)
  if @children.has_key? method
    @children[method]
  else
    super
  end
end
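The child lookup above can be tried in isolation. The following is a minimal sketch, not Ariel's own code: SketchNode and its add_child(name, node) signature are hypothetical stand-ins, and respond_to_missing? is an addition that the source shown above does not define. It is included only because Ruby expects it alongside method_missing so that respond_to? stays consistent.

```ruby
# Hypothetical stand-in for a node that exposes its children as methods.
# SketchNode and add_child(name, node) are illustrative, not Ariel's API.
class SketchNode
  def initialize
    @children = {}
  end

  def add_child(name, node)
    @children[name] = node
  end

  # Return the child stored under the method's name, else defer to Object,
  # which raises NoMethodError.
  def method_missing(method, *args, &block)
    if @children.key?(method)
      @children[method]
    else
      super
    end
  end

  # Not in the source above: keeps respond_to? in step with method_missing.
  def respond_to_missing?(method, include_private = false)
    @children.key?(method) || super
  end
end

root = SketchNode.new
title = SketchNode.new
root.add_child(:title, title)

found = root.title                # looks up the child stored under :title
known = root.respond_to?(:title)  # true only because of respond_to_missing?
```

Calling a name that is not a child key, such as root.author here, still raises NoMethodError because the call reaches super.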
Instance Attribute Details
#ruleset ⇒ Object
Returns the value of attribute ruleset.
# File 'lib/ariel/structure_node.rb', line 9
def ruleset
  @ruleset
end
Instance Method Details
#apply_extraction_tree_on(root_node, extract_labels = false) ⇒ Object
Applies the extraction rules stored in the current StructureNode and all its descendant children.
# File 'lib/ariel/structure_node.rb', line 42
def apply_extraction_tree_on(root_node, extract_labels=false)
  extraction_queue = [root_node]
  until extraction_queue.empty? do
    new_parent = extraction_queue.shift
    new_parent.meta.structure.children.values.each do |child|
      if extract_labels
        extracted_node=LabelUtils.extract_labeled_region(child, new_parent)
      else
        extracted_node=child.extract_from(new_parent)
      end
      extraction_queue.push(extracted_node) if extracted_node
    end
  end
  return root_node
end
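The method above walks the tree breadth first: it takes a parent off the front of a queue, extracts each child of that parent's structure, and queues every extracted node so its own children are processed later. A minimal sketch of that traversal order follows. SketchStructure, SketchExtracted and apply_tree are hypothetical names, and the "extraction" here only copies the child's name, since the real rule application is outside the scope of this sketch.

```ruby
# Hypothetical minimal node types; these are not Ariel's real classes.
SketchStructure = Struct.new(:name, :children)
SketchExtracted = Struct.new(:name, :structure, :children)

# Breadth-first walk driven by a FIFO queue, as in apply_extraction_tree_on.
# Returns the names of the extracted nodes in the order they were created.
def apply_tree(root)
  queue = [root]
  order = []
  until queue.empty?
    parent = queue.shift
    parent.structure.children.each do |child_structure|
      # Stand-in for child.extract_from(parent): build a node and attach it.
      extracted = SketchExtracted.new(child_structure.name, child_structure, [])
      parent.children << extracted
      order << extracted.name
      queue.push(extracted)
    end
  end
  order
end

leaf = SketchStructure.new(:a1, [])
a = SketchStructure.new(:a, [leaf])
b = SketchStructure.new(:b, [])
root_structure = SketchStructure.new(:root, [a, b])
root = SketchExtracted.new(:root, root_structure, [])

order = apply_tree(root)  # => [:a, :b, :a1]
```

Both children of the root are handled before the grandchild :a1, which is the property the queue provides.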
#extend_structure {|_self| ... } ⇒ Object
Used to extend an already created node. For example:
node.extend_structure do |r|
r.new_field1
r.new_field2
end
# File 'lib/ariel/structure_node.rb', line 21
def extend_structure(&block)
  yield self if block_given?
end
#extract_from(node) ⇒ Object
Given a Node to apply its rules to, this method creates a new node and adds it as a child of the given node. For StructureNodes of :list type, the list is extracted and so is each of the list items. In this case, only the list items are yielded.
# File 'lib/ariel/structure_node.rb', line 29
def extract_from(node)
  # Will be reimplemented to return an array of extracted items
  newstream = @ruleset.apply_to(node.tokenstream)
  extracted_node = ExtractedNode.new(meta.name, newstream, self)
  node.add_child extracted_node if newstream
  if self.meta.node_type == :list
    #Do stuff
  end
  return extracted_node
end
#item(name, &block) ⇒ Object
# File 'lib/ariel/structure_node.rb', line 58
def item(name, &block)
  self.add_child(StructureNode.new(name, &block))
end
#list_item(name, &block) ⇒ Object
# File 'lib/ariel/structure_node.rb', line 62
def list_item(name, &block)
  self.add_child(StructureNode.new(name, :list, &block))
end
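Taken together, the constructor, item and list_item allow a structure to be declared with nested blocks. The sketch below illustrates that pattern with a hypothetical SketchStructureNode class. Two assumptions are made: add_child (which comes from NodeLike and is not shown on this page) is assumed to store the node under its name, and a plain Struct stands in for the OpenStruct used by the real constructor.

```ruby
# Hypothetical stand-in mirroring the constructor, item and list_item above.
class SketchStructureNode
  # Struct used here in place of OpenStruct to keep the sketch dependency free.
  Meta = Struct.new(:name, :node_type)

  attr_reader :children, :meta

  def initialize(name = :root, type = :not_list)
    @children = {}
    @meta = Meta.new(name, type)
    yield self if block_given?
  end

  # Assumption: the real add_child is provided by NodeLike; here it simply
  # stores the node under its name and returns it.
  def add_child(node)
    @children[node.meta.name] = node
  end

  def item(name, &block)
    add_child(SketchStructureNode.new(name, &block))
  end

  def list_item(name, &block)
    add_child(SketchStructureNode.new(name, :list, &block))
  end
end

# Declare a root with one plain field and one list whose items have an author.
structure = SketchStructureNode.new do |r|
  r.item :title
  r.list_item :comments do |c|
    c.item :author
  end
end

comments = structure.children[:comments]
```

Because each constructor yields the new node to its block, the nesting of the blocks mirrors the nesting of the resulting tree.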