Module: TexterraNLP

Includes:
TexterraNLPSpecs
Included in:
TexterraAPI
Defined in:
lib/ispras-api/texterra/nlp.rb

Constant Summary

Constants included from TexterraNLPSpecs

TexterraNLPSpecs::NLP_SPECS

Instance Method Summary

Instance Method Details

#disambiguation_annotate(text) ⇒ Array

Detects the most appropriate meanings (concepts) for the terms occurring in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 73

def disambiguation_annotate(text)
  preset_nlp(:disambiguation, text)
end
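
Like most methods in this module, disambiguation_annotate is a one-line wrapper that hands a spec key and the text to preset_nlp. The sketch below illustrates that delegation pattern in isolation. FakeNLP, FakeClient, and the canned annotation (including its :start, :end, and :text keys) are hypothetical stand-ins; the real preset_nlp is defined elsewhere in the library and is not shown in this section, so this shows only the shape of the call, not the service's actual response.

```ruby
# Hypothetical stand-in for the module: it mimics the "thin wrapper around
# preset_nlp" pattern without contacting the Texterra service.
module FakeNLP
  def disambiguation_annotate(text)
    preset_nlp(:disambiguation, text)
  end

  private

  # Assumed behaviour: return an Array of annotation hashes.
  # Here a single canned annotation covers the whole text.
  def preset_nlp(method_name, text)
    [{ method: method_name, start: 0, end: text.length, text: text }]
  end
end

# Any class that includes the module gains the public wrapper methods.
class FakeClient
  include FakeNLP
end

annotations = FakeClient.new.disambiguation_annotate('Apple released a phone')
annotations.each { |a| puts "#{a[:start]}..#{a[:end]}: #{a[:text]}" }
```

Because every wrapper returns an Array, callers can iterate over the result without checking its type first.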

#domain_detection_annotate(text) ⇒ Array

Detects the most appropriate domain for the given text. Currently only two specific domains are supported: ‘movie’ and ‘politics’. If neither is detected, the text is assumed to have no specific domain, that is, to belong to the general domain.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 92

def domain_detection_annotate(text)
  preset_nlp(:domainDetection, text)
end

#domain_polarity_detection_annotate(text, domain = '') ⇒ Array

Detects whether the given text has positive, negative, or no sentiment with respect to a domain. If no domain is provided, domain detection is applied first so that the method can try to achieve the best results. If no domain is detected, the general-domain algorithm is applied.

Parameters:

  • text (String)

    Text to process

  • domain (String) (defaults to: '')

    Domain for polarity detection

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 119

def domain_polarity_detection_annotate(text, domain = '')
  specs = NLP_SPECS[:domainPolarityDetection]
  domain = "(#{domain})" unless domain.empty?
  result = POST(specs[:path] % domain, specs[:params], text: text)[:nlp_document][:annotations][:i_annotation]
  return [] if result.nil?
  result = [].push result unless result.is_a? Array
  result.map { |e| assign_text(e, text) }
end
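
Besides sending the request, the method above does two things: it wraps a non-empty domain in parentheses before substituting it into the path template, and it normalizes the response so that callers always receive an Array. The sketch below reproduces only those two steps. The path template and the helper names domain_path and normalize_annotations are hypothetical; the real template is stored in NLP_SPECS[:domainPolarityDetection] and is not shown in this section.

```ruby
# Hypothetical path template with one %s placeholder for the domain part.
PATH_TEMPLATE = 'polarity%s'

# Wrap a non-empty domain in parentheses, then format it into the template,
# mirroring `specs[:path] % domain` in the method above.
def domain_path(domain = '')
  domain = "(#{domain})" unless domain.empty?
  PATH_TEMPLATE % domain
end

# Normalize a response value: nil becomes [], a single annotation is wrapped
# in an Array, and an Array is returned unchanged.
def normalize_annotations(result)
  return [] if result.nil?
  result.is_a?(Array) ? result : [result]
end

puts domain_path          # polarity
puts domain_path('movie') # polarity(movie)
p normalize_annotations(nil).length              # 0
p normalize_annotations({ value: 'sample' }).length # 1
```

Note that Kernel#Array would not be a drop-in replacement for the wrapping step, since it converts a Hash into an Array of key/value pairs rather than a one-element Array.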

#key_concepts_annotate(text) ⇒ Array

Key concepts are the concepts that provide a short (conceptual) and informative description of a text. This service extracts a set of key concepts for a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 82

def key_concepts_annotate(text)
  preset_nlp(:keyConcepts, text)
end

#language_detection_annotate(text) ⇒ Array

Detects the language of a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 9

def language_detection_annotate(text)
  preset_nlp(:languageDetection, text)
end

#lemmatization_annotate(text) ⇒ Array

Detects the lemma of each word in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 33

def lemmatization_annotate(text)
  preset_nlp(:lemmatization, text)
end

#named_entities_annotate(text) ⇒ Array

Finds all occurrences of named entities in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 57

def named_entities_annotate(text)
  preset_nlp(:namedEntities, text)
end

#polarity_detection_annotate(text) ⇒ Array

Detects whether the given text has positive, negative, or no sentiment.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 108

def polarity_detection_annotate(text)
  preset_nlp(:polarityDetection, text)
end

#pos_tagging_annotate(text) ⇒ Array

Detects the part-of-speech tag for each word in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 41

def pos_tagging_annotate(text)
  preset_nlp(:posTagging, text)
end

#sentence_detection_annotate(text) ⇒ Array

Detects sentence boundaries in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 17

def sentence_detection_annotate(text)
  preset_nlp(:sentenceDetection, text)
end

#spelling_correction_annotate(text) ⇒ Array

Tries to correct misprints and other spelling errors in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 49

def spelling_correction_annotate(text)
  preset_nlp(:spellingCorrection, text)
end

#subjectivity_detection_annotate(text) ⇒ Array

Detects whether the given text is subjective.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 100

def subjectivity_detection_annotate(text)
  preset_nlp(:subjectivityDetection, text)
end

#syntax_detection(text) ⇒ Array

Detects syntactic relations in a text. Only works for Russian texts.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 141

def syntax_detection(text)
  preset_nlp(:syntaxDetection, text).each do |an|
    an[:value][:parent_token] = assign_text(an[:value][:parent_token], text) if an[:value] && an[:value][:parent_token]
  end
end
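
Because Array#each returns its receiver, the method above returns the annotation Array obtained from preset_nlp after updating each :parent_token in place. The sketch below shows that behaviour on made-up data. The sample annotations, their :start and :end keys, the helper name attach_parent_text, and the assign_text stand-in (which here copies the covered substring into the token) are assumptions for illustration; the library's own assign_text is defined elsewhere and may behave differently. The sample text is a transliterated placeholder, since no service call is made.

```ruby
# Hypothetical stand-in for assign_text: adds the substring covered by the
# token's :start/:end offsets. The real helper may differ.
def assign_text(token, text)
  token.merge(text: text[token[:start]...token[:end]])
end

# Equivalent to the loop in syntax_detection, applied to annotations passed
# in directly instead of fetched from the service.
def attach_parent_text(annotations, text)
  annotations.each do |an|
    next unless an[:value] && an[:value][:parent_token]
    an[:value][:parent_token] = assign_text(an[:value][:parent_token], text)
  end
end

text = 'mama myla ramu'
annotations = [
  { start: 0, end: 4, value: { parent_token: { start: 5, end: 9 } } },
  { start: 5, end: 9, value: nil } # no parent token: left untouched
]

result = attach_parent_text(annotations, text)
puts result.equal?(annotations)              # true: each returns the receiver
puts result[0][:value][:parent_token][:text] # myla
```

One consequence of this in-place update is that the caller receives the same Array object that was iterated, with the annotations themselves modified rather than copied.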

#term_detection_annotate(text) ⇒ Array

Extracts non-overlapping terms from a given text; a term is a textual representation of some real-world concept.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 65

def term_detection_annotate(text)
  preset_nlp(:termDetection, text)
end

#tokenization_annotate(text) ⇒ Array

Detects all tokens (minimal significant text parts) in a given text.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 25

def tokenization_annotate(text)
  preset_nlp(:tokenization, text)
end

#tweet_normalization(text) ⇒ Array

Detects Twitter-specific entities: hashtags, user names, emoticons, and URLs. Also detects stop words, misspellings, spelling suggestions, and spelling corrections.

Parameters:

  • text (String)

    Text to process

Returns:

  • (Array)

    Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 133

def tweet_normalization(text)
  preset_nlp(:tweetNormalization, text)
end