Class: Spacy::Token

Inherits:

Object


Defined in: lib/ruby-spacy.rb

Overview

See also the spaCy Python API documentation for [`Token`](https://spacy.io/api/token).

Instance Attribute Summary

#py_token ⇒ Object readonly

A Python `Token` instance accessible via `PyCall`.
#text ⇒ String readonly

A string representing the token.

Instance Method Summary

#ancestors ⇒ Array<Token>

Returns the token’s ancestors.
#children ⇒ Array<Token>

Returns a sequence of the token’s immediate syntactic children.
#dep ⇒ String

Returns the dependency relation by calling `dep_` on the `@py_token` object.
#ent_type ⇒ String

Returns the named entity type by calling `ent_type_` on the `@py_token` object.
#head ⇒ Token

Returns the head token.
#initialize(py_token) ⇒ Token constructor

It is recommended to use Doc#tokens or Span#tokens methods to create tokens.
#lang ⇒ String

Returns the language by calling `lang_` on the `@py_token` object.
#lefts ⇒ Array<Token>

The leftward immediate children of the word in the syntactic dependency parse.
#lemma ⇒ String

Returns the lemma by calling `lemma_` on the `@py_token` object.
#lexeme ⇒ Lexeme

Returns a lexeme object.
#lower ⇒ String

Returns the lowercase form by calling `lower_` on the `@py_token` object.
#method_missing(name, *args) ⇒ Object

Methods defined in Python but not wrapped in ruby-spacy can be called by this dynamic method handling mechanism.
#morphology(hash: true) ⇒ Hash, String

Returns a hash or string of morphological information.
#pos ⇒ String

Returns the coarse-grained part-of-speech tag by calling `pos_` on the `@py_token` object.
#respond_to_missing?(sym) ⇒ Boolean
#rights ⇒ Array<Token>

The rightward immediate children of the word in the syntactic dependency parse.
#shape ⇒ String

Returns the shape (e.g. "Xxxxx") by calling `shape_` on the `@py_token` object.
#subtree ⇒ Array<Token>

Returns the token in question and the tokens that descend from it.
#tag ⇒ String

Returns the fine-grained part-of-speech tag by calling `tag_` on the `@py_token` object.
#to_s ⇒ String

String representation of the token.
#whitespace ⇒ String

Returns the trailing space character, if present, by calling `whitespace_` on the `@py_token` object.

Constructor Details

#initialize(py_token) ⇒ `Token`

It is recommended to use the Doc#tokens or Span#tokens methods to create tokens. A token cannot be generated from scratch; it must wrap a pre-existing Python `Token` object.

Parameters:

py_token (Object) —

Python `Token` object

# File 'lib/ruby-spacy.rb', line 698

def initialize(py_token)
  @py_token = py_token
  @text = @py_token.text
end
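
The constructor only stores the Python object and caches its text. The following is a minimal sketch of that pattern, not the library's own code: `FakePyToken` is a hypothetical stand-in for a PyCall-wrapped spaCy token so the example runs without Python, and `SketchToken` is an illustrative name that is not part of ruby-spacy.

```ruby
# Hypothetical stand-in for a PyCall-wrapped spaCy token.
# Real tokens come from Python; this Struct only provides `text`.
FakePyToken = Struct.new(:text)

# Illustrative wrapper mirroring the constructor shown above.
class SketchToken
  attr_reader :py_token, :text

  def initialize(py_token)
    @py_token = py_token
    @text = @py_token.text # read the surface string once and keep it
  end

  def to_s
    @text
  end
end

token = SketchToken.new(FakePyToken.new("Tokyo"))
puts token.text # prints "Tokyo"
puts token.to_s # prints "Tokyo"
```

In real use, tokens would normally be obtained from Doc#tokens or Span#tokens rather than built by hand, as recommended above.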

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(name, *args) ⇒ `Object`

Methods defined in Python but not wrapped in ruby-spacy can be called by this dynamic method handling mechanism.



# File 'lib/ruby-spacy.rb', line 844

def method_missing(name, *args)
  @py_token.send(name, *args)
end
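
As a rough illustration of the delegation described above, the sketch below forwards unknown calls to a wrapped object. `FakePyTokenWithExtras` and `DelegatingToken` are hypothetical names, the stand-in's return values are hard-coded, and the `respond_to_missing?` shown here is the conventional two-argument companion rather than the one defined in ruby-spacy.

```ruby
# Hypothetical stand-in for a Python token. spaCy tokens expose `is_alpha`
# and `nbor`; the values returned here are hard-coded placeholders.
class FakePyTokenWithExtras
  def is_alpha
    true
  end

  def nbor(offset = 1)
    "neighbor at offset #{offset}"
  end
end

# Illustrative wrapper: any method it does not define is forwarded.
class DelegatingToken
  def initialize(py_token)
    @py_token = py_token
  end

  # Forward the call, including its arguments, to the wrapped object.
  def method_missing(name, *args)
    @py_token.send(name, *args)
  end

  # Assumption: report only the methods the wrapped object can answer.
  def respond_to_missing?(name, include_private = false)
    @py_token.respond_to?(name, include_private) || super
  end
end

token = DelegatingToken.new(FakePyTokenWithExtras.new)
puts token.is_alpha # prints "true"
puts token.nbor(2)  # prints "neighbor at offset 2"
```

With this arrangement, a call that neither the wrapper nor the wrapped object understands still raises `NoMethodError`, because `send` fails on the wrapped object.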

Instance Attribute Details

#py_token ⇒ `Object` (readonly)

Returns a Python `Token` instance accessible via `PyCall`.

Returns:

(Object) —

a Python `Token` instance accessible via `PyCall`



# File 'lib/ruby-spacy.rb', line 690

def py_token
  @py_token
end

#text ⇒ `String` (readonly)

Returns a string representing the token.

Returns:

(String) —

a string representing the token



# File 'lib/ruby-spacy.rb', line 693

def text
  @text
end

Instance Method Details

#ancestors ⇒ `Array<Token>`

Returns the token’s ancestors.

Returns:

(Array<Token>) —

an array of tokens

# File 'lib/ruby-spacy.rb', line 721

def ancestors
  ancestor_array = []
  PyCall::List.call(@py_token.ancestors).each do |ancestor|
    ancestor_array << Token.new(ancestor)
  end
  ancestor_array
end

#children ⇒ `Array<Token>`

Returns a sequence of the token’s immediate syntactic children.

Returns:

(Array<Token>) —

an array of tokens

# File 'lib/ruby-spacy.rb', line 731

def children
  child_array = []
  PyCall::List.call(@py_token.children).each do |child|
    child_array << Token.new(child)
  end
  child_array
end
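
The navigation methods (#ancestors, #children, #lefts, #rights, #subtree) share one shape: take a sequence of Python tokens and wrap each element in a Ruby token. The sketch below shows that shape with `map` on a plain Ruby array. `FakeNode` and `TreeToken` are hypothetical names, and the conversion done by `PyCall::List.call` in the real method is left out because the stand-in already holds Ruby arrays.

```ruby
# Hypothetical stand-in for a Python token with a list of children.
FakeNode = Struct.new(:text, :children)

# Illustrative wrapper; not part of ruby-spacy.
class TreeToken
  attr_reader :text

  def initialize(py_token)
    @py_token = py_token
    @text = py_token.text
  end

  # Wrap every immediate child in a new TreeToken.
  def children
    @py_token.children.map { |child| TreeToken.new(child) }
  end
end

# A tiny hand-built tree for "She likes red apples".
red    = FakeNode.new("red", [])
apples = FakeNode.new("apples", [red])
she    = FakeNode.new("She", [])
likes  = FakeNode.new("likes", [she, apples])

root = TreeToken.new(likes)
puts root.children.map(&:text).inspect # prints ["She", "apples"]
```

Because every call builds fresh wrapper objects, two wrappers around the same underlying token are distinct Ruby objects in this sketch.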

#dep ⇒ `String`

Returns the dependency relation by calling `dep_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 815

def dep
  @py_token.dep_
end
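
Accessors such as #dep, #lemma, #pos and #tag all follow the same convention: the Ruby method drops the trailing underscore and forwards to the Python attribute that has one. The sketch below illustrates the naming convention only. `FakeAttrToken` and `AttrToken` are hypothetical, the attribute values are made up, and generating the methods with `define_method` is an illustrative choice rather than how ruby-spacy defines them.

```ruby
# Hypothetical stand-in exposing spaCy-style string attributes,
# each named with a trailing underscore.
FakeAttrToken = Struct.new(:lemma_, :pos_, :tag_, :dep_)

# Illustrative wrapper; not part of ruby-spacy.
class AttrToken
  def initialize(py_token)
    @py_token = py_token
  end

  # For each Ruby-side name, forward to the underscore-suffixed attribute.
  %i[lemma pos tag dep].each do |name|
    define_method(name) { @py_token.public_send("#{name}_") }
  end
end

token = AttrToken.new(FakeAttrToken.new("run", "VERB", "VBD", "ROOT"))
puts [token.lemma, token.pos, token.tag, token.dep].join(" ")
# prints "run VERB VBD ROOT"
```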

#ent_type ⇒ `String`

Returns the named entity type by calling `ent_type_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 833

def ent_type
  @py_token.ent_type_
end

#head ⇒ `Token`

Returns the head token

Returns:

(Token)



# File 'lib/ruby-spacy.rb', line 705

def head
  Token.new(@py_token.head)
end
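
One common use of a head link is to climb from any token to the root of its sentence. In spaCy, the root token has itself as its head, which gives the loop a stopping condition. The sketch below assumes that behavior and uses a hypothetical `FakeHeadToken` stand-in with an index `i` (mirroring spaCy's `Token.i`); `root_of` is an illustrative helper, not a ruby-spacy method.

```ruby
# Hypothetical stand-in: `i` is the token index, `head` is the syntactic parent.
FakeHeadToken = Struct.new(:i, :text, :head)

# Hand-built parse of "She likes red apples"; the root points to itself.
likes = FakeHeadToken.new(1, "likes", nil)
likes.head = likes
she    = FakeHeadToken.new(0, "She", likes)
apples = FakeHeadToken.new(3, "apples", likes)
red    = FakeHeadToken.new(2, "red", apples)

# Follow head links until a token is its own head.
# Indices are compared because wrappers may be distinct objects.
def root_of(token)
  token = token.head until token.head.i == token.i
  token
end

puts root_of(red).text # prints "likes"
puts root_of(she).text # prints "likes"
```

Comparing by index rather than object identity matters with the real wrapper, since #head builds a new `Token` object on every call.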

#lang ⇒ `String`

Returns the language by calling `lang_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 821

def lang
  @py_token.lang_
end

#lefts ⇒ `Array<Token>`

The leftward immediate children of the word in the syntactic dependency parse.

Returns:

(Array<Token>) —

an array of tokens

# File 'lib/ruby-spacy.rb', line 741

def lefts
  token_array = []
  PyCall::List.call(@py_token.lefts).each do |token|
    token_array << Token.new(token)
  end
  token_array
end

#lemma ⇒ `String`

Returns the lemma by calling `lemma_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 785

def lemma
  @py_token.lemma_
end

#lexeme ⇒ `Lexeme`

Returns a lexeme object

Returns:

(Lexeme)



# File 'lib/ruby-spacy.rb', line 839

def lexeme
  Lexeme.new(@py_token.lex)
end

#lower ⇒ `String`

Returns the lowercase form by calling `lower_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 791

def lower
  @py_token.lower_
end

#morphology(hash: true) ⇒ `Hash`, `String`

Returns a hash or string of morphological information

Parameters:

hash (Boolean) (defaults to: true) —

if true, a hash will be returned instead of a string

Returns:

(Hash, String)

# File 'lib/ruby-spacy.rb', line 768

def morphology(hash: true)
  if @py_token.has_morph
    morph_analysis = @py_token.morph
    if hash
      morph_analysis.to_dict
    else
      morph_analysis.to_s
    end
  elsif hash
    {}
  else
    ""
  end
end
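
The method above has four outcomes, depending on whether morphological data exists and on the `hash:` flag. The sketch below reproduces that branching with hypothetical stand-ins so it runs without Python: `FakeMorph`, `FakeMorphToken` and `morphology_of` are illustrative names, and the feature values are made up.

```ruby
# Hypothetical stand-in for a morphological analysis object.
class FakeMorph
  def to_dict
    { "Number" => "Sing", "Person" => "3" }
  end

  def to_s
    "Number=Sing|Person=3"
  end
end

# Hypothetical stand-in for a Python token.
FakeMorphToken = Struct.new(:has_morph, :morph)

# Same branching as #morphology, written as a free-standing helper.
def morphology_of(py_token, hash: true)
  if py_token.has_morph
    hash ? py_token.morph.to_dict : py_token.morph.to_s
  else
    hash ? {} : "" # empty value of the requested type
  end
end

with_morph    = FakeMorphToken.new(true, FakeMorph.new)
without_morph = FakeMorphToken.new(false, nil)

p morphology_of(with_morph)                 # hash of features
p morphology_of(with_morph, hash: false)    # "Number=Sing|Person=3"
p morphology_of(without_morph)              # {}
p morphology_of(without_morph, hash: false) # ""
```

Returning an empty hash or empty string keeps the return type consistent with the flag, so callers do not need a separate `nil` check.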

#pos ⇒ `String`

Returns the coarse-grained part-of-speech tag by calling `pos_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 803

def pos
  @py_token.pos_
end

#respond_to_missing?(sym) ⇒ `Boolean`

Returns:

(Boolean)



# File 'lib/ruby-spacy.rb', line 848

def respond_to_missing?(sym)
  sym ? true : super
end

#rights ⇒ `Array<Token>`

The rightward immediate children of the word in the syntactic dependency parse.

Returns:

(Array<Token>) —

an array of tokens

# File 'lib/ruby-spacy.rb', line 751

def rights
  token_array = []
  PyCall::List.call(@py_token.rights).each do |token|
    token_array << Token.new(token)
  end
  token_array
end

#shape ⇒ `String`

Returns the shape (e.g. "Xxxxx") by calling `shape_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 797

def shape
  @py_token.shape_
end

#subtree ⇒ `Array<Token>`

Returns the token in question and the tokens that descend from it.

Returns:

(Array<Token>) —

an array of tokens

# File 'lib/ruby-spacy.rb', line 711

def subtree
  descendant_array = []
  PyCall::List.call(@py_token.subtree).each do |descendant|
    descendant_array << Token.new(descendant)
  end
  descendant_array
end

#tag ⇒ `String`

Returns the fine-grained part-of-speech tag by calling `tag_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 809

def tag
  @py_token.tag_
end

#to_s ⇒ `String`

String representation of the token.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 761

def to_s
  @text
end

#whitespace ⇒ `String`

Returns the trailing space character, if present, by calling `whitespace_` on the `@py_token` object.

Returns:

(String)



# File 'lib/ruby-spacy.rb', line 827

def whitespace
  @py_token.whitespace_
end
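
Because each token carries its own trailing whitespace, a text can be rebuilt by concatenating every token's text and whitespace in order. The sketch below shows this with a hypothetical `FakeWsToken` stand-in and hand-written values; with real tokens the same expression would use #text and #whitespace.

```ruby
# Hypothetical stand-in: `whitespace` is " " or "" depending on what
# follows the token in the original text.
FakeWsToken = Struct.new(:text, :whitespace)

tokens = [
  FakeWsToken.new("Hello", ""),
  FakeWsToken.new(",", " "),
  FakeWsToken.new("world", ""),
  FakeWsToken.new("!", "")
]

# Join each token with its own trailing whitespace, not a fixed separator.
rebuilt = tokens.map { |t| t.text + t.whitespace }.join
puts rebuilt # prints "Hello, world!"
```

Joining with a fixed `" "` separator instead would produce "Hello , world !", which is why the per-token whitespace is useful.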
