Class: Spacy::Token

Inherits:
Object
Defined in:
lib/ruby-spacy.rb

Overview

See also the spaCy Python API documentation for [Token](https://spacy.io/api/token).

Constructor Details

#initialize(py_token) ⇒ Token

It is recommended to create tokens with the Doc#tokens or Span#tokens methods. A token cannot be generated from scratch; it always wraps a pre-existing Python Token object.

Parameters:

  • py_token (Object)

    Python Token object



# File 'lib/ruby-spacy.rb', line 911

def initialize(py_token)
  @py_token = py_token
  @text = @py_token.text
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(name, *args) ⇒ Object

Methods defined in Python but not wrapped in ruby-spacy can be called by this dynamic method handling mechanism.



# File 'lib/ruby-spacy.rb', line 1043

def method_missing(name, *args)
  @py_token.send(name, *args)
end
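
The forwarding pattern can be tried without a Python runtime. The following is a minimal sketch, assuming a plain Ruby object as a stand-in for the PyCall-wrapped token. `StubPyToken` and `ForwardingToken` are hypothetical names used only for illustration, `respond_to?` stands in for the attribute check that ruby-spacy performs on the Python side, and the guard that falls back to `super` is an addition of this sketch rather than part of the method shown above.

```ruby
# Hypothetical stand-in for a Python Token object; not part of ruby-spacy.
class StubPyToken
  def text
    "Apple"
  end

  def is_alpha
    true
  end
end

# Simplified wrapper that forwards unknown calls to the wrapped object.
class ForwardingToken
  def initialize(py_token)
    @py_token = py_token
  end

  # Any method not defined on the wrapper is sent to the wrapped object.
  # The guard (an assumption of this sketch) keeps the usual NoMethodError
  # for names the wrapped object does not know either.
  def method_missing(name, *args)
    return super unless @py_token.respond_to?(name)

    @py_token.send(name, *args)
  end

  # Keeps respond_to? consistent with what method_missing can handle.
  def respond_to_missing?(sym, include_private = false)
    @py_token.respond_to?(sym) || super
  end
end

token = ForwardingToken.new(StubPyToken.new)
puts token.is_alpha                          # forwarded to the stand-in object
puts token.respond_to?(:is_alpha)            # answered by respond_to_missing?
puts token.respond_to?(:no_such_attribute)
```

Defining respond_to_missing? alongside method_missing means that `respond_to?` and `method` report the forwarded methods accurately, which is why the two are normally implemented as a pair.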

Instance Attribute Details

#py_token ⇒ Object (readonly)

Returns a Python Token instance accessible via PyCall.

Returns:

  • (Object)

    a Python Token instance accessible via PyCall



# File 'lib/ruby-spacy.rb', line 903

def py_token
  @py_token
end

#text ⇒ String (readonly)

Returns a string representing the token.

Returns:

  • (String)

    a string representing the token



# File 'lib/ruby-spacy.rb', line 906

def text
  @text
end

Instance Method Details

#ancestors ⇒ Array<Token>

Returns the token’s ancestors.

Returns:

  • (Array<Token>)

    an array of tokens



# File 'lib/ruby-spacy.rb', line 936

def ancestors
  PyCall::List.call(@py_token.ancestors).map { |ancestor| Token.new(ancestor) }
end

#children ⇒ Array<Token>

Returns a sequence of the token’s immediate syntactic children.

Returns:

  • (Array<Token>)

    an array of tokens



# File 'lib/ruby-spacy.rb', line 942

def children
  PyCall::List.call(@py_token.children).map { |child| Token.new(child) }
end

#dep ⇒ String

Returns the dependency relation by calling ‘dep_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 1014

def dep
  @py_token.dep_
end

#ent_type ⇒ String

Returns the named entity type by calling ‘ent_type_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 1032

def ent_type
  @py_token.ent_type_
end

#head ⇒ Token

Returns the token’s syntactic head.

Returns:

  • (Token)

# File 'lib/ruby-spacy.rb', line 924

def head
  Token.new(@py_token.head)
end

#idx ⇒ Integer

Returns the character offset of the token within the parent document.

Returns:

  • (Integer)


# File 'lib/ruby-spacy.rb', line 918

def idx
  @py_token.idx
end

#instance_variables_to_inspect ⇒ Object



# File 'lib/ruby-spacy.rb', line 1051

def instance_variables_to_inspect
  [:@text]
end

#lang ⇒ String

Returns the language by calling ‘lang_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 1020

def lang
  @py_token.lang_
end

#lefts ⇒ Array<Token>

Returns the token’s leftward immediate children in the syntactic dependency parse.

Returns:

  • (Array<Token>)

    an array of tokens



# File 'lib/ruby-spacy.rb', line 948

def lefts
  PyCall::List.call(@py_token.lefts).map { |token| Token.new(token) }
end

#lemma ⇒ String

Returns the lemma by calling ‘lemma_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 984

def lemma
  @py_token.lemma_
end

#lexeme ⇒ Lexeme

Returns the lexeme object for the token.

Returns:

  • (Lexeme)

# File 'lib/ruby-spacy.rb', line 1038

def lexeme
  Lexeme.new(@py_token.lex)
end

#lower ⇒ String

Returns the lowercase form by calling ‘lower_’ on the ‘@py_token’ object.

Returns:

  • (String)


990
991
992
# File 'lib/ruby-spacy.rb', line 990

def lower
  @py_token.lower_
end

#morphology(hash: true) ⇒ Hash, String

Returns the token’s morphological information as a hash or as a string.

Parameters:

  • hash (Boolean) (defaults to: true)

    if true, a hash will be returned instead of a string

Returns:

  • (Hash, String)


# File 'lib/ruby-spacy.rb', line 967

def morphology(hash: true)
  if @py_token.has_morph
    morph_analysis = @py_token.morph
    if hash
      morph_analysis.to_dict
    else
      morph_analysis.to_s
    end
  elsif hash
    {}
  else
    ""
  end
end
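
The four possible outcomes of this method (hash or string, with or without a morphological analysis) can be exercised with plain Ruby objects. This is only a sketch under stated assumptions: `StubMorph`, `StubMorphToken`, and `morphology_of` are hypothetical names, the feature values are made up, and the string form imitates the `Feature=Value|Feature=Value` notation rather than calling spaCy.

```ruby
# Hypothetical stand-in for spaCy's morphological analysis object.
class StubMorph
  def initialize(features)
    @features = features
  end

  def to_dict
    @features.dup
  end

  # Feature=Value pairs joined by "|".
  def to_s
    @features.map { |key, value| "#{key}=#{value}" }.join("|")
  end
end

# Hypothetical stand-in for a Python token; morph is nil when absent.
StubMorphToken = Struct.new(:morph) do
  def has_morph
    !morph.nil?
  end
end

# Same branching as the method above, written as a free-standing function.
def morphology_of(py_token, hash: true)
  if py_token.has_morph
    hash ? py_token.morph.to_dict : py_token.morph.to_s
  elsif hash
    {}
  else
    ""
  end
end

tagged = StubMorphToken.new(StubMorph.new("Number" => "Sing", "Person" => "3"))
untagged = StubMorphToken.new(nil)

p morphology_of(tagged)                 # hash form
p morphology_of(tagged, hash: false)    # string form
p morphology_of(untagged)               # empty hash
p morphology_of(untagged, hash: false)  # empty string
```

Because a token without morphological information yields an empty hash or an empty string, callers can iterate over or print the result without a nil check.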

#pos ⇒ String

Returns the coarse-grained part-of-speech tag by calling ‘pos_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 1002

def pos
  @py_token.pos_
end

#respond_to_missing?(sym, include_private = false) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/ruby-spacy.rb', line 1047

def respond_to_missing?(sym, include_private = false)
  Spacy.py_hasattr?(@py_token, sym) || super
end

#rights ⇒ Array<Token>

Returns the token’s rightward immediate children in the syntactic dependency parse.

Returns:

  • (Array<Token>)

    an array of tokens



# File 'lib/ruby-spacy.rb', line 954

def rights
  PyCall::List.call(@py_token.rights).map { |token| Token.new(token) }
end

#shape ⇒ String

Returns the shape (e.g. “Xxxxx”) by calling ‘shape_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 996

def shape
  @py_token.shape_
end

#subtree ⇒ Array<Token>

Returns the token in question and the tokens that descend from it.

Returns:

  • (Array<Token>)

    an array of tokens



# File 'lib/ruby-spacy.rb', line 930

def subtree
  PyCall::List.call(@py_token.subtree).map { |descendant| Token.new(descendant) }
end
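
To make the relationships between head, ancestors, children, lefts, rights, and subtree concrete, here is a toy dependency tree built from plain Ruby objects. It does not use ruby-spacy or PyCall. `ToyNode` and `toy_sentence` are hypothetical names, and the hand-built parse of "She ate the apple" is an assumption rather than real spaCy output.

```ruby
# Hypothetical stand-in node for a parsed token; not part of ruby-spacy.
class ToyNode
  attr_reader :text, :index, :children
  attr_accessor :head

  def initialize(text, index)
    @text = text
    @index = index   # position of the word in the sentence
    @children = []
    @head = self     # as in spaCy, the root is its own head
  end

  # Makes child an immediate dependent of this node.
  def attach(child)
    child.head = self
    @children << child
    @children.sort_by!(&:index)
    self
  end

  # Immediate children to the left and to the right of this node.
  def lefts
    children.select { |child| child.index < index }
  end

  def rights
    children.select { |child| child.index > index }
  end

  # Follows head links upward until the root is reached.
  def ancestors
    result = []
    node = self
    until node.head.equal?(node)
      node = node.head
      result << node
    end
    result
  end

  # The node itself plus all of its descendants, in sentence order.
  def subtree
    ([self] + children.flat_map(&:subtree)).sort_by(&:index)
  end
end

# "She ate the apple": "ate" is the root and "the" modifies "apple".
def toy_sentence
  she, ate, the, apple =
    %w[She ate the apple].each_with_index.map { |word, i| ToyNode.new(word, i) }
  ate.attach(she).attach(apple)
  apple.attach(the)
  { she: she, ate: ate, the: the, apple: apple }
end

nodes = toy_sentence
puts nodes[:ate].lefts.map(&:text).join(", ")
puts nodes[:ate].rights.map(&:text).join(", ")
puts nodes[:the].ancestors.map(&:text).join(", ")
puts nodes[:apple].subtree.map(&:text).join(", ")
```

Note that the subtree includes the starting node itself, while children, lefts, and rights cover only the immediate dependents.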

#tag ⇒ String

Returns the fine-grained part-of-speech tag by calling ‘tag_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 1008

def tag
  @py_token.tag_
end

#to_s ⇒ String

Returns the string representation of the token.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 960

def to_s
  @text
end

#whitespace ⇒ String

Returns the trailing whitespace character, if present, by calling ‘whitespace_’ on the ‘@py_token’ object.

Returns:

  • (String)


# File 'lib/ruby-spacy.rb', line 1026

def whitespace
  @py_token.whitespace_
end
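
One practical consequence of exposing text, whitespace, and idx together is that the original string can be rebuilt from its tokens, and each token can be located in that string by its character offset. The sketch below illustrates the idea with plain Ruby. `StubPiece`, `toy_tokenize`, and `rebuild` are hypothetical names, and the toy tokenizer assumes words separated by single spaces, which is far simpler than what spaCy does.

```ruby
# Hypothetical stand-in exposing the three attributes used here.
StubPiece = Struct.new(:text, :whitespace, :idx)

# Toy tokenizer: assumes words are separated by single spaces.
def toy_tokenize(sentence)
  offset = 0
  words = sentence.split(" ")
  words.each_with_index.map do |word, i|
    trailing = i < words.length - 1 ? " " : ""
    piece = StubPiece.new(word, trailing, offset)
    offset += word.length + trailing.length
    piece
  end
end

# Concatenating each text with its trailing whitespace restores the input.
def rebuild(pieces)
  pieces.map { |piece| piece.text + piece.whitespace }.join
end

sentence = "Apple is looking at startups"
pieces = toy_tokenize(sentence)

puts rebuild(pieces) == sentence
pieces.each do |piece|
  # idx is the character offset at which the token starts in the input.
  puts "#{piece.idx}\t#{sentence[piece.idx, piece.text.length]}"
end
```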