Class: Spacy::PhraseMatcher

Inherits:
Object
Defined in:
lib/ruby-spacy.rb

Overview

See also the spaCy Python API documentation for [PhraseMatcher](https://spacy.io/api/phrasematcher). PhraseMatcher efficiently matches large terminology lists, and it is faster than Matcher when matching many phrase patterns.

Constructor Details

#initialize(nlp, attr: "ORTH") ⇒ PhraseMatcher

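Creates a Spacy::PhraseMatcher instance for the given language model (nlp). The attr keyword selects the token attribute that phrases are matched on: "ORTH" (the default) compares the verbatim token text, while "LOWER" compares lowercased text and so matches case-insensitively.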

Examples:

Case-insensitive matching

matcher = Spacy::PhraseMatcher.new(nlp, attr: "LOWER")


# File 'lib/ruby-spacy.rb', line 719

def initialize(nlp, attr: "ORTH")
  @nlp = nlp
  @py_matcher = PyPhraseMatcher.call(nlp.py_nlp.vocab, attr: attr)
end
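
To make the effect of the attr option concrete, here is a minimal pure-Ruby sketch of the idea: each token is reduced to a comparison key before phrases are compared. This is not ruby-spacy or spaCy code. The helpers token_key and phrase_match? are hypothetical names introduced for illustration, only "ORTH" and "LOWER" are modelled, and whitespace splitting stands in for the spaCy tokenizer.

```ruby
# Toy model of the `attr` option (hypothetical helpers, not part of ruby-spacy).
# Each token is mapped to a comparison key; phrases match when their key
# sequences are equal.

def token_key(token, attr)
  case attr
  when "ORTH"  then token           # verbatim text, case-sensitive
  when "LOWER" then token.downcase  # lowercased text, case-insensitive
  else raise ArgumentError, "unsupported attr in this sketch: #{attr}"
  end
end

def phrase_match?(phrase, text, attr: "ORTH")
  # Assumption: whitespace splitting replaces real tokenization.
  wanted = phrase.split.map { |t| token_key(t, attr) }
  return false if wanted.empty?

  tokens = text.split.map { |t| token_key(t, attr) }
  # Slide a window of the phrase's length over the text tokens.
  tokens.each_cons(wanted.size).any? { |window| window == wanted }
end

puts phrase_match?("MacBook Pro", "my macbook pro is new")                 # false
puts phrase_match?("MacBook Pro", "my macbook pro is new", attr: "LOWER")  # true
```

With the default "ORTH" key the differently cased text does not match; switching to "LOWER" makes the same phrase match, which mirrors the case-insensitive example above.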

Instance Attribute Details

#nlp ⇒ Language (readonly)



# File 'lib/ruby-spacy.rb', line 711

def nlp
  @nlp
end

#py_matcher ⇒ Object (readonly)



# File 'lib/ruby-spacy.rb', line 708

def py_matcher
  @py_matcher
end

Instance Method Details

#add(label, phrases) ⇒ Object

Adds phrase patterns to the matcher.

Examples:

Add product names

matcher.add("PRODUCT", ["iPhone", "MacBook Pro", "iPad"])


# File 'lib/ruby-spacy.rb', line 729

def add(label, phrases)
  patterns = phrases.map { |phrase| @nlp.py_nlp.call(phrase) }
  @py_matcher.add(label, patterns)
end

#match(doc) ⇒ Array<Span>

Executes the phrase match on the given document and returns the matching spans.

Examples:

Find matches

matches = matcher.match(doc)
matches.each { |span| puts "#{span.text} => #{span.label}" }


# File 'lib/ruby-spacy.rb', line 740

def match(doc)
  py_matches = @py_matcher.call(doc.py_doc, as_spans: true)
  PyCall::List.call(py_matches).map { |py_span| Span.new(doc, py_span: py_span) }
end
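
The add/match flow described above can be pictured with a small pure-Ruby model that runs without spaCy installed. It is a sketch under stated assumptions, not the library's implementation: ToySpan and ToyPhraseMatcher are hypothetical names, whitespace splitting stands in for the tokenizer, and a plain Hash keyed by token sequence stands in for whatever data structure spaCy uses internally.

```ruby
# Toy model of PhraseMatcher#add and #match (hypothetical classes).
ToySpan = Struct.new(:text, :label, :start, :stop)

class ToyPhraseMatcher
  def initialize
    @patterns = {} # token sequence (Array<String>) => label
  end

  # Register each phrase under the label, keyed by its token sequence.
  def add(label, phrases)
    phrases.each do |phrase|
      tokens = phrase.split
      next if tokens.empty? # ignore blank phrases in this sketch
      @patterns[tokens] = label
    end
  end

  # Return one ToySpan per phrase occurrence, in order of start position.
  def match(text)
    tokens = text.split
    lengths = @patterns.keys.map(&:size).uniq.sort
    spans = []
    tokens.each_index do |start|
      lengths.each do |len|
        window = tokens[start, len]
        next unless window.size == len # window runs past the end of the text
        label = @patterns[window]
        spans << ToySpan.new(window.join(" "), label, start, start + len) if label
      end
    end
    spans
  end
end

matcher = ToyPhraseMatcher.new
matcher.add("PRODUCT", ["iPhone", "MacBook Pro", "iPad"])
matches = matcher.match("She sold her MacBook Pro and bought an iPad")
matches.each { |span| puts "#{span.text} => #{span.label}" }
# prints:
#   MacBook Pro => PRODUCT
#   iPad => PRODUCT
```

Because each candidate window is looked up in a Hash, the work per window does not grow with the number of registered phrases, which gives some intuition for why phrase matching suits large terminology lists.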