Module: TexterraNLP

Includes: TexterraNLPSpecs

Included in: TexterraAPI

Defined in: lib/ispras-api/texterra/nlp.rb

Constant Summary

Constants included from TexterraNLPSpecs

TexterraNLPSpecs::NLP_SPECS

Instance Method Summary

#disambiguation_annotate(text) ⇒ Array

Detects the most appropriate meanings (concepts) for terms occurring in a given text.

#domain_detection_annotate(text) ⇒ Array

Detects the most appropriate domain for the given text.

#domain_polarity_detection_annotate(text, domain = '') ⇒ Array

Detects whether the given text has positive, negative, or no sentiment with respect to a domain.

#key_concepts_annotate(text) ⇒ Array

Key concepts are the concepts that provide a short (conceptual) and informative description of a text.

#language_detection_annotate(text) ⇒ Array

Detects the language of a given text.

#lemmatization_annotate(text) ⇒ Array

Detects the lemma of each word in a given text.

#named_entities_annotate(text) ⇒ Array

Finds all named entity occurrences in a given text.

#polarity_detection_annotate(text) ⇒ Array

Detects whether the given text has positive, negative, or no sentiment.

#pos_tagging_annotate(text) ⇒ Array

Detects the part-of-speech tag for each word in a given text.

#sentence_detection_annotate(text) ⇒ Array

Detects sentence boundaries in a given text.

#spelling_correction_annotate(text) ⇒ Array

Tries to correct misprints and other spelling errors in a given text.

#subjectivity_detection_annotate(text) ⇒ Array

Detects whether the given text is subjective or not.

#syntax_detection(text) ⇒ Array

Detects syntax relations in a text.

#term_detection_annotate(text) ⇒ Array

Extracts non-overlapping terms from a given text; a term is a textual representation of some real-world concept.

#tokenization_annotate(text) ⇒ Array

Detects all tokens (minimal significant text parts) in a given text.

#tweet_normalization(text) ⇒ Array

Detects Twitter-specific entities: hashtags, user names, emoticons, and URLs.

Instance Method Details

#disambiguation_annotate(text) ⇒ `Array`

Detects the most appropriate meanings (concepts) for terms occurring in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 73

def disambiguation_annotate(text)
  preset_nlp(:disambiguation, text)
end
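
Most methods in this module have the same shape as the one above: they pass a spec key and the text to preset_nlp. That helper is not documented on this page, so the following is only a sketch of the delegation pattern. NLPDelegationSketch and StubClient are hypothetical names, and the stubbed preset_nlp simply echoes its arguments instead of calling the Texterra service.

```ruby
# Stand-in for TexterraNLP: same delegation shape, two methods only.
module NLPDelegationSketch
  def tokenization_annotate(text)
    preset_nlp(:tokenization, text)
  end

  def disambiguation_annotate(text)
    preset_nlp(:disambiguation, text)
  end
end

# Hypothetical host class. The real preset_nlp talks to the service;
# this stub records the spec key and returns a one-element Array.
class StubClient
  include NLPDelegationSketch

  attr_reader :calls

  def initialize
    @calls = []
  end

  def preset_nlp(spec_key, text)
    @calls << spec_key
    [{ spec: spec_key, text: text }]
  end
end

client = StubClient.new
annotations = client.disambiguation_annotate('Apple released a new phone')
```

Because the module only relies on preset_nlp being available on the including object, a stub like this can stand in for the real client in unit tests.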

#domain_detection_annotate(text) ⇒ `Array`

Detects the most appropriate domain for the given text. Currently only two specific domains are supported: ‘movie’ and ‘politics’. If neither is detected, the text is assumed to have no specific domain (the general domain).

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 92

def domain_detection_annotate(text)
  preset_nlp(:domainDetection, text)
end

#domain_polarity_detection_annotate(text, domain = '') ⇒ `Array`

Detects whether the given text has positive, negative, or no sentiment with respect to a domain. If no domain is provided, domain detection is applied first so that the method can try to achieve the best results. If no domain is detected, the general-domain algorithm is applied.

Parameters:

text (String) —

Text to process
domain (String) (defaults to: '') —

Domain for polarity detection

Returns:

(Array) —

Texterra annotations

# File 'lib/ispras-api/texterra/nlp.rb', line 119

def domain_polarity_detection_annotate(text, domain = '')
  specs = NLP_SPECS[:domainPolarityDetection]
  domain = "(#{domain})" unless domain.empty?
  result = POST(specs[:path] % domain, specs[:params], text: text)[:nlp_document][:annotations][:i_annotation]
  return [] if result.nil?
  result = [].push result unless result.is_a? Array
  result.map { |e| assign_text(e, text) }
end
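
Two steps in the method above are easy to miss: the domain is wrapped in parentheses before being substituted into the path template, and a nil or non-Array result is normalized to an Array. The sketch below isolates both steps. The template string 'nlp/polarity%s' is a hypothetical placeholder (the real value lives in NLP_SPECS and is not shown on this page), and the helper names are illustrative only.

```ruby
# Hypothetical path template; the real one is NLP_SPECS[:domainPolarityDetection][:path].
PATH_TEMPLATE_SKETCH = 'nlp/polarity%s'

# An empty domain leaves the path bare; otherwise the domain is
# wrapped in parentheses and substituted for %s.
def polarity_path_sketch(domain = '')
  suffix = domain.empty? ? '' : "(#{domain})"
  PATH_TEMPLATE_SKETCH % suffix
end

# Mirrors the result handling: nil becomes [], a lone value is wrapped,
# and an Array is passed through unchanged.
def wrap_annotations_sketch(result)
  return [] if result.nil?
  result.is_a?(Array) ? result : [result]
end
```

With these helpers, callers can always iterate over the result without checking for nil or a single Hash first.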

#key_concepts_annotate(text) ⇒ `Array`

Key concepts are the concepts that provide a short (conceptual) and informative description of a text. This service extracts a set of key concepts for a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 82

def key_concepts_annotate(text)
  preset_nlp(:keyConcepts, text)
end

#language_detection_annotate(text) ⇒ `Array`

Detects the language of a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 9

def language_detection_annotate(text)
  preset_nlp(:languageDetection, text)
end

#lemmatization_annotate(text) ⇒ `Array`

Detects the lemma of each word in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 33

def lemmatization_annotate(text)
  preset_nlp(:lemmatization, text)
end

#named_entities_annotate(text) ⇒ `Array`

Finds all named entity occurrences in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 57

def named_entities_annotate(text)
  preset_nlp(:namedEntities, text)
end

#polarity_detection_annotate(text) ⇒ `Array`

Detects whether the given text has positive, negative, or no sentiment.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 108

def polarity_detection_annotate(text)
  preset_nlp(:polarityDetection, text)
end

#pos_tagging_annotate(text) ⇒ `Array`

Detects the part-of-speech tag for each word in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 41

def pos_tagging_annotate(text)
  preset_nlp(:posTagging, text)
end

#sentence_detection_annotate(text) ⇒ `Array`

Detects sentence boundaries in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 17

def sentence_detection_annotate(text)
  preset_nlp(:sentenceDetection, text)
end

#spelling_correction_annotate(text) ⇒ `Array`

Tries to correct misprints and other spelling errors in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 49

def spelling_correction_annotate(text)
  preset_nlp(:spellingCorrection, text)
end

#subjectivity_detection_annotate(text) ⇒ `Array`

Detects whether the given text is subjective or not.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 100

def subjectivity_detection_annotate(text)
  preset_nlp(:subjectivityDetection, text)
end

#syntax_detection(text) ⇒ `Array`

Detects syntax relations in a text. Only works for Russian texts.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations

# File 'lib/ispras-api/texterra/nlp.rb', line 141

def syntax_detection(text)
  preset_nlp(:syntaxDetection, text).each do |an|
    an[:value][:parent_token] = assign_text(an[:value][:parent_token], text) if an[:value] && an[:value][:parent_token]
  end
end
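
Unlike the other preset-based methods, this one post-processes each annotation: when an annotation's value carries a parent_token, that token is passed through assign_text. The helper is not shown on this page, so the sketch below assumes, purely for illustration, that annotations carry :start and :end character offsets and that assign_text attaches the matching substring under :text. Both the helper names and the sample data are hypothetical, and the sample text is only there to show the offsets, not real service output.

```ruby
# Hypothetical stand-in for assign_text: copies the annotation and adds
# the substring covered by its (assumed) :start/:end offsets.
def assign_text_sketch(annotation, text)
  annotation.merge(text: text[annotation[:start]...annotation[:end]])
end

# Same loop shape as syntax_detection: annotations without a value or
# without a parent token are left untouched.
def attach_parent_text_sketch(annotations, text)
  annotations.each do |an|
    value = an[:value]
    next unless value && value[:parent_token]
    value[:parent_token] = assign_text_sketch(value[:parent_token], text)
  end
end

sample_text = 'cats chase mice'
sample_annotations = [
  { start: 0, end: 4, value: { parent_token: { start: 5, end: 10 } } },
  { start: 5, end: 10, value: nil }
]
enriched = attach_parent_text_sketch(sample_annotations, sample_text)
```

Note that each returns its receiver, so the method's return value is the same Array of annotations, now with enriched parent tokens.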

#term_detection_annotate(text) ⇒ `Array`

Extracts non-overlapping terms from a given text; a term is a textual representation of some real-world concept.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 65

def term_detection_annotate(text)
  preset_nlp(:termDetection, text)
end

#tokenization_annotate(text) ⇒ `Array`

Detects all tokens (minimal significant text parts) in a given text.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 25

def tokenization_annotate(text)
  preset_nlp(:tokenization, text)
end

#tweet_normalization(text) ⇒ `Array`

Detects Twitter-specific entities: hashtags, user names, emoticons, and URLs. Also detects stop words, misspellings, spelling suggestions, and spelling corrections.

Parameters:

text (String) —

Text to process

Returns:

(Array) —

Texterra annotations



# File 'lib/ispras-api/texterra/nlp.rb', line 133

def tweet_normalization(text)
  preset_nlp(:tweetNormalization, text)
end
