Class: Spacy::PhraseMatcher

Inherits:
Object
Defined in:
lib/ruby-spacy.rb

Overview

See also the spaCy Python API documentation for [`PhraseMatcher`](spacy.io/api/phrasematcher). PhraseMatcher efficiently matches large terminology lists, and it is faster than Matcher when matching many phrase patterns.

Constructor Details

#initialize(nlp, attr: "ORTH") ⇒ PhraseMatcher

Creates a Spacy::PhraseMatcher instance.

Examples:

Case-insensitive matching

matcher = Spacy::PhraseMatcher.new(nlp, attr: "LOWER")

Parameters:

  • nlp (Language)

    an instance of Language class

  • attr (String) (defaults to: "ORTH")

    the token attribute to match on. Use "LOWER" for case-insensitive matching.



# File 'lib/ruby-spacy.rb', line 601

def initialize(nlp, attr: "ORTH")
  @nlp = nlp
  @py_matcher = PyPhraseMatcher.call(nlp.py_nlp.vocab, attr: attr)
end
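
To make the effect of `attr` concrete, here is a toy model in plain Ruby. It is not the library's implementation: it assumes whitespace tokenization, and `ToyPhraseMatcher` is a hypothetical class that exists only for this illustration. The real matcher delegates tokenization and matching to spaCy.

```ruby
# Toy illustration only: mimics how the matched token attribute changes results.
# ToyPhraseMatcher is a hypothetical name and is not part of ruby-spacy.
class ToyPhraseMatcher
  def initialize(attr: "ORTH")
    # "ORTH" compares tokens verbatim; "LOWER" compares their lowercase forms.
    @normalize = attr == "LOWER" ? ->(tok) { tok.downcase } : ->(tok) { tok }
    @patterns = {}
  end

  def add(label, phrases)
    @patterns[label] = phrases.map { |phrase| phrase.split.map(&@normalize) }
  end

  # Returns [label, matched_text] pairs for every pattern occurrence in text.
  def match(text)
    tokens = text.split
    keys = tokens.map(&@normalize)
    @patterns.flat_map do |label, patterns|
      patterns.flat_map do |pattern|
        keys.each_cons(pattern.size).each_with_index.filter_map do |window, i|
          [label, tokens[i, pattern.size].join(" ")] if window == pattern
        end
      end
    end
  end
end

exact = ToyPhraseMatcher.new(attr: "ORTH")
exact.add("PRODUCT", ["MacBook Pro"])
loose = ToyPhraseMatcher.new(attr: "LOWER")
loose.add("PRODUCT", ["MacBook Pro"])

text = "I sold my macbook pro yesterday"
p exact.match(text) # => []
p loose.match(text) # => [["PRODUCT", "macbook pro"]]
```

The same pattern finds nothing under "ORTH" because the casing differs, while "LOWER" compares lowercase forms on both sides and reports the span as it appears in the text.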

Instance Attribute Details

#nlp ⇒ Language (readonly)

Returns the language model used by this matcher.

Returns:

  • (Language)

    the language model used by this matcher



# File 'lib/ruby-spacy.rb', line 593

def nlp
  @nlp
end

#py_matcher ⇒ Object (readonly)

Returns a Python `PhraseMatcher` instance accessible via `PyCall`.

Returns:

  • (Object)

    a Python `PhraseMatcher` instance accessible via `PyCall`



# File 'lib/ruby-spacy.rb', line 590

def py_matcher
  @py_matcher
end

Instance Method Details

#add(label, phrases) ⇒ Object

Adds phrase patterns to the matcher.

Examples:

Add product names

matcher.add("PRODUCT", ["iPhone", "MacBook Pro", "iPad"])

Parameters:

  • label (String)

    a label string given to the patterns

  • phrases (Array<String>)

    an array of phrase strings to match



# File 'lib/ruby-spacy.rb', line 611

def add(label, phrases)
  patterns = phrases.map { |phrase| @nlp.py_nlp.call(phrase) }
  @py_matcher.add(label, patterns)
end
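
When terminology arrives grouped by label, a small helper can register each group in turn. The sketch below is a convenience under stated assumptions, not part of ruby-spacy: `register_terms` is a hypothetical name, and it relies only on the `add(label, phrases)` signature documented above. A stand-in recorder object (`RecordingMatcher`, also hypothetical) takes the place of the real matcher so the snippet runs without spaCy installed.

```ruby
# Hypothetical helper (not part of ruby-spacy): registers every label's
# phrases on any object that responds to add(label, phrases).
def register_terms(matcher, terms)
  terms.each do |label, phrases|
    # Trim whitespace, then drop blank entries and duplicates.
    cleaned = phrases.map(&:strip).reject(&:empty?).uniq
    matcher.add(label, cleaned) unless cleaned.empty?
  end
  matcher
end

# Stand-in for Spacy::PhraseMatcher so the sketch runs without spaCy.
class RecordingMatcher
  attr_reader :calls

  def initialize
    @calls = []
  end

  def add(label, phrases)
    @calls << [label, phrases]
  end
end

terms = {
  "PRODUCT" => ["iPhone", "MacBook Pro", " iPad ", "iPhone"],
  "ORG" => []
}
recorder = register_terms(RecordingMatcher.new, terms)
p recorder.calls
```

With a real matcher, the corresponding call would presumably be `register_terms(Spacy::PhraseMatcher.new(nlp), terms)`.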

#match(doc) ⇒ Array<Span>

Executes the phrase match and returns the matching spans.

Examples:

Find matches

matches = matcher.match(doc)
matches.each { |span| puts "#{span.text} => #{span.label}" }

Parameters:

  • doc (Doc)

    a Doc instance to search

Returns:

  • (Array<Span>)

    an array of Span objects with labels



# File 'lib/ruby-spacy.rb', line 622

def match(doc)
  py_matches = @py_matcher.call(doc.py_doc, as_spans: true)
  results = []
  PyCall::List.call(py_matches).each do |py_span|
    span = Span.new(doc, py_span: py_span)
    results << span
  end
  results
end
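
Because `#match` returns a flat array, callers often want the results grouped by label. The following is a minimal sketch of that post-processing step. `texts_by_label` is a hypothetical helper, not part of ruby-spacy, and it assumes only that each span responds to `text` and `label`, as in the example above. `FakeSpan` is a stand-in so the snippet runs without spaCy installed.

```ruby
# Hypothetical helper (not part of ruby-spacy): groups matched span texts by
# label. It only assumes each span responds to #text and #label.
def texts_by_label(spans)
  spans.group_by(&:label).transform_values { |group| group.map(&:text).uniq }
end

# Stand-in for Spacy::Span so the sketch runs without spaCy.
FakeSpan = Struct.new(:text, :label)

spans = [
  FakeSpan.new("iPhone", "PRODUCT"),
  FakeSpan.new("Apple", "ORG"),
  FakeSpan.new("iPhone", "PRODUCT"),
  FakeSpan.new("iPad", "PRODUCT")
]
grouped = texts_by_label(spans)
grouped.each { |label, texts| puts "#{label}: #{texts.join(', ')}" }
```

With the real API, the equivalent call would be `texts_by_label(matcher.match(doc))`.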