CRM114.rb: CRM114 Controllable Regex Mutilator for Ruby
This is a Ruby interface to the CRM114 Controllable Regex Mutilator, an advanced and fast text classifier. CRM114 uses sparse binary polynomial matching with a Bayesian Chain Rule evaluator and a hidden Markov model to categorize data with up to 99.87% accuracy.
### About CRM114
### Usage
The CRM114 library interface is very similar to that of the [Classifier](rubyforge.org/projects/classifier) project.
Here follows a brief example:
    require 'crm114'
    crm = Classifier::CRM114.new([:interesting, :boring])
    crm.train! :interesting, 'Some data set with a decent signal to noise ratio.'
    crm.train! :boring, 'Pig latin, as in lorem ipsum dolor sit amet.'
    crm.classify 'Lorem ipsum'      #=> [:boring, 0.99]
    crm.interesting? 'Lorem ipsum'  #=> false
    crm.boring? 'Lorem ipsum'       #=> true
Have a look at the included unit tests for more comprehensive examples.
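Since `classify` returns a `[category, probability]` pair, callers may want to ignore low-confidence results. The following is a minimal sketch of one way to do that. The helper `confident_category` and the `0.9` default threshold are hypothetical and not part of CRM114.rb. The sketch works on plain arrays, so it runs without the gem installed:

```ruby
# Hypothetical helper, not part of CRM114.rb.
# Takes a [category, probability] pair in the shape that `classify` returns
# and yields the category only when the probability meets the threshold.
def confident_category(result, threshold = 0.9)
  category, probability = result
  probability >= threshold ? category : nil
end

# Sample pairs standing in for real `crm.classify` results.
puts confident_category([:boring, 0.99]).inspect  # prints :boring
puts confident_category([:boring, 0.62]).inspect  # prints nil
```

With a trained classifier, the call would look like `confident_category(crm.classify('Lorem ipsum'))`.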
### Dependencies
Requires the CRM114 binaries to be installed. Specifically, the `crm` binary should be accessible in the current user’s `PATH` environment variable.
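To verify this requirement before using the library, a script can search `PATH` for the `crm` executable. The following is a minimal sketch using only the Ruby standard library. The helper name `find_executable_in_path` is an assumption for illustration and is not provided by CRM114.rb:

```ruby
# Hypothetical helper, not part of CRM114.rb.
# Returns the full path of the first executable file called `name` found in
# the given PATH-style string, or nil when no such file exists.
def find_executable_in_path(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

crm_path = find_executable_in_path('crm')
if crm_path
  puts "Found crm at #{crm_path}"
else
  warn 'crm binary not found in PATH; install the CRM114 binaries first'
end
```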
### Download
To get a local working copy of the development repository, do:
% git clone git://github.com/bendiken/crm114.git
Alternatively, you can download the latest development version as a tarball:
% wget http://github.com/bendiken/crm114/tarball/master
### Installation
The recommended installation method is via RubyGems. To install the latest official release from Gemcutter, do:
% [sudo] gem install crm114
### Resources
- <www.elegantchaos.com/node/129> (crm.py)
### Author
- [Arto Bendiken]([email protected]) - <ar.to/>
### License
CRM114.rb is free and unencumbered public domain software. For more information, see <unlicense.org/> or the accompanying UNLICENSE file.