Class: Classifier::CRM114
- Inherits: Object
  - Object
  - Classifier::CRM114
- Defined in: lib/crm114.rb, lib/crm114/version.rb
Defined Under Namespace
Modules: VERSION
Constant Summary
- CLASSIFICATION_TYPE = '<osb unique microgroom>'
- FILE_EXTENSION = '.css'
- CMD_CRM = '/usr/bin/env crm'
- OPT_LEARN = '-{ learn %s ( %s ) }'
- OPT_CLASSIFY = '-{ isolate (:stats:); classify %s ( %s ) (:stats:); match [:stats:] (:: :best: :prob:) /Best match to file .. \\(%s\\/([[:graph:]]+)\\%s\\) prob: ([0-9.]+)/; output /:*:best:\\t:*:prob:/ }'
Class Method Summary
- .version ⇒ String?
Returns a string containing the installed CRM114 engine version in a format such as “20060118-BlameTheReavers”.
Instance Method Summary
- #classify(text = nil, &block) ⇒ Array(Symbol, Float)
Returns the classification of the provided text as a tuple containing the highest-probability category and a confidence indicator in the range of 0.5..1.0.
- #initialize(categories, options = {}) ⇒ CRM114 (constructor)
Returns a new CRM114 classifier defined by the given categories.
- #learn!(category, text, &block) ⇒ void (also: #train!)
Trains the classifier to consider the given text to be a sample from the set named by category.
- #method_missing(symbol, *args) ⇒ Object
:nodoc:
- #unlearn!(category, text, &block) ⇒ void (also: #untrain!)
Constructor Details
#initialize(categories, options = {}) ⇒ CRM114
Returns a new CRM114 classifier defined by the given categories.
# File 'lib/crm114.rb', line 25
def initialize(categories, options = {})
  @categories = categories.to_a.collect { |category| category.to_s.to_sym }
  @path = File.expand_path(options[:path] || '.')
  @debug = options[:debug] || false
end
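The category normalization performed by the constructor can be sketched outside the class, so that it runs without the gem or the crm binary. The helper name `normalize_categories` is hypothetical and not part of the library; it only mirrors the `to_s.to_sym` conversion shown above.

```ruby
# Hypothetical helper mirroring the normalization in #initialize:
# every category is converted to a String and then to a Symbol,
# so strings and symbols can be mixed freely by the caller.
def normalize_categories(categories)
  categories.to_a.collect { |category| category.to_s.to_sym }
end

normalized = normalize_categories(['spam', :ham, 'interesting'])
```

Because of this step, passing `['spam', 'ham']` or `[:spam, :ham]` should yield the same set of categories.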
Dynamic Method Handling
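This class handles dynamic methods through the method_missing method. For a known category, calling <category>!(text) is forwarded to learn!, and <category>?(text) returns whether classify picks that category. Any other method name falls through to super.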
#method_missing(symbol, *args) ⇒ Object
:nodoc:
# File 'lib/crm114.rb', line 76
def method_missing(symbol, *args) # :nodoc:
  case symbol.to_s[-1]
  when ?!
    category = symbol.to_s.chop.to_sym
    return learn!(category, *args) if @categories.include?(category)
  when ?? # it's a predicate
    category = symbol.to_s.chop.to_sym
    return classify(*args).first == category if @categories.include?(category)
  end
  super
end
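The dispatch rules above can be illustrated with a stand-in class that never shells out to crm. `FakeClassifier`, its stubbed `classify` result, and the added `respond_to_missing?` are assumptions made for this sketch; they are not part of Classifier::CRM114.

```ruby
# Hypothetical stand-in that mimics the dispatch rules of
# Classifier::CRM114#method_missing without calling the crm binary.
class FakeClassifier
  attr_reader :learned

  def initialize(categories)
    @categories = categories.map { |c| c.to_s.to_sym }
    @learned = []
  end

  # Records the call instead of training a real statistics file.
  def learn!(category, text)
    @learned << [category, text]
  end

  # Stub: always answers :spam so the predicate form can be shown.
  def classify(_text)
    [:spam, 0.9]
  end

  def method_missing(symbol, *args)
    name = symbol.to_s
    category = name.chop.to_sym
    if @categories.include?(category)
      return learn!(category, *args) if name.end_with?('!')
      return classify(*args).first == category if name.end_with?('?')
    end
    super
  end

  # Keeps respond_to? consistent with the dynamic methods (an addition
  # for this sketch; the original class does not define it).
  def respond_to_missing?(symbol, include_private = false)
    name = symbol.to_s
    (name.end_with?('!', '?') && @categories.include?(name.chop.to_sym)) || super
  end
end

fake = FakeClassifier.new([:spam, :ham])
fake.spam!('Buy cheap pills')      # forwarded to learn!(:spam, ...)
is_spam = fake.spam?('some text')  # compares the stubbed result with :spam
is_ham  = fake.ham?('some text')   # compares the stubbed result with :ham
```

A name that does not match a known category, such as `fake.bogus!('x')`, reaches `super` and raises NoMethodError as usual.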
Class Method Details
.version ⇒ String?
Returns a string containing the installed CRM114 engine version in a format such as “20060118-BlameTheReavers”. Returns nil if the first line of the crm output does not match the expected version pattern.
# File 'lib/crm114.rb', line 16
def self.version
  $1 if IO.popen(CMD_CRM + ' -v', 'r') { |pipe| pipe.readline } =~ /CRM114, version ([\d\w\-\.]+)/
end
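The version extraction can be tried without crm installed by applying the same regular expression to a sample line. The banner text below is an assumed example built around the version string quoted above, and `parse_crm_version` is a hypothetical helper name.

```ruby
# Hypothetical helper applying the same pattern as .version to a string.
def parse_crm_version(banner)
  match = /CRM114, version ([\d\w\-\.]+)/.match(banner)
  match && match[1]
end

# Assumed sample banner; the real first line printed by `crm -v` may differ.
sample  = 'This is CRM114, version 20060118-BlameTheReavers (sample banner)'
version = parse_crm_version(sample)
missing = parse_crm_version('some unrelated output')
```

The capture stops at the first character outside digits, word characters, hyphens, and dots, so trailing text after the version is ignored.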
Instance Method Details
#classify(text = nil, &block) ⇒ Array(Symbol, Float)
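Returns the classification of the provided text as a tuple containing the highest-probability category and a confidence indicator in the range of 0.5..1.0. If no classification can be read from the crm output, returns [nil, 0.0]. When a block is given, it receives the pipe to crm and is responsible for writing the input in place of text.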
# File 'lib/crm114.rb', line 62
def classify(text = nil, &block)
  files = @categories.collect { |category| css_file_path(category) }
  cmd = CMD_CRM + " '" + (OPT_CLASSIFY % [CLASSIFICATION_TYPE, files.join(' '), @path.gsub(/\//, '\/'), FILE_EXTENSION]) + "'"
  puts cmd if @debug
  result = IO.popen(cmd, 'r+') do |pipe|
    block_given? ? block.call(pipe) : pipe.write(text)
    pipe.close_write
    pipe.readline unless pipe.closed? || pipe.eof?
  end
  return [nil, 0.0] unless result && result.include?("\t")
  result = result.split("\t")
  [result.first.to_sym, result.last.to_f]
end
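The last three lines of classify turn a tab-separated line into the returned tuple. A minimal sketch of that step, using a hypothetical helper `parse_classification` and an assumed sample line rather than real crm output:

```ruby
# Hypothetical helper mirroring the tail of #classify: the crm script
# prints "<category>\t<probability>", and anything without a tab is
# treated as a failed classification.
def parse_classification(line)
  return [nil, 0.0] unless line && line.include?("\t")
  parts = line.split("\t")
  [parts.first.to_sym, parts.last.to_f]
end

hit  = parse_classification("spam\t0.9987")  # assumed sample output
miss = parse_classification(nil)             # crm produced no line
```

Callers would therefore need to check for a nil category before relying on the confidence value.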
#learn!(category, text, &block) ⇒ void Also known as: train!
This method returns an undefined value.
Trains the classifier to consider the given text to be a sample from the set named by category.
# File 'lib/crm114.rb', line 38
def learn!(category, text, &block)
  cmd = CMD_CRM + " '" + (OPT_LEARN % [CLASSIFICATION_TYPE, css_file_path(category)]) + "'"
  puts cmd if @debug
  IO.popen(cmd, 'w') { |pipe| block_given? ? block.call(pipe) : pipe.write(text) }
end
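To see what learn! hands to the shell, the command string can be assembled by hand from the constants listed in the Constant Summary. The statistics file path below is a hypothetical placeholder, since css_file_path is not documented on this page; the sketch only builds the string and does not run crm.

```ruby
# Values copied from the Constant Summary, held in local variables here.
classification_type = '<osb unique microgroom>'
opt_learn           = '-{ learn %s ( %s ) }'
cmd_crm             = '/usr/bin/env crm'

# Hypothetical path to the category's statistics file.
css_path = '/tmp/spam.css'

# Same concatenation as in learn!: the crm program is wrapped in single quotes.
cmd = cmd_crm + " '" + (opt_learn % [classification_type, css_path]) + "'"
puts cmd
```

Since the crm program is wrapped in single quotes, a path containing a single quote would presumably break the command; that is an observation about the string construction, not documented behaviour.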
#unlearn!(category, text, &block) ⇒ void Also known as: untrain!
Not supported at present: this method always raises NotImplementedError instead of returning a value.
# File 'lib/crm114.rb', line 49
def unlearn!(category, text, &block) # :nodoc:
  raise NotImplementedError.new('unlearning not supported at present')
end