Class: Nimono::Cabocha
- Inherits:
- Object
- Nimono::Cabocha < Object
- Includes:
- CabochaLib, OptionParse
- Defined in:
- lib/nimono/nimono.rb
Overview
`Cabocha` is a class providing an interface to the CaboCha library. The options supported by CaboCha can be passed to this class in almost the same way as on the command line.
Constant Summary
Constants included from CabochaLib
Nimono::CabochaLib::CABOCHA_PATH
Constants included from OptionParse
Instance Attribute Summary
-
#chunks ⇒ Array
readonly
Array of Chunk objects.
-
#libpath ⇒ String
readonly
Absolute file path to CaboCha library.
-
#options ⇒ Hash
readonly
CaboCha options as Key-Value pairs.
-
#tokens ⇒ Array
readonly
Array of Token objects.
Instance Method Summary
-
#initialize(options = {}) ⇒ Cabocha
constructor
Initializes CaboCha with the given `options`.
-
#parse(text) ⇒ String
Parses the given `text`, returning the CaboCha output as a string.
-
#to_s ⇒ String
The result of parsing Japanese text.
Methods included from CabochaLib
Methods included from OptionParse
Constructor Details
#initialize(options = {}) ⇒ Cabocha
Initializes CaboCha with the given `options`. `options` may be given either as a string (CaboCha command line arguments) or as a Ruby-style hash.
Options supported are:
-
:output_format
-
:input_layer
-
:output_layer
-
:ne
-
:parser_model
-
:chunker_model
-
:ne_model
-
:posset
-
:charset
-
:charset_file
-
:rcfile
-
:mecabrc
-
:mecab_dicdir
-
:mecab_userdic
-
:output
CaboCha command line arguments, in short form (-f1) or long form (--output-format=1), may be used in addition to Ruby-style hashes.
e.g.
require 'nimono'
nc = Nimono::Cabocha.new(output_format: 1)
nc = Nimono::Cabocha.new('-f1') # equivalent, using a command line argument string
=> #<Nimono::Cabocha:0x6364e48d
@sparse_tostr=#<Proc:0x74d917f5@/home/foo/nimono/lib/nimono/nimono.rb:54 (lambda)>,
@libpath="/usr/local/lib/libcabocha.so",
@options={:output_format=>1},
@tree=#<FFI::Pointer address=0x7f6ecc2e3790>,
@parser=#<FFI::Pointer address=0x7f6ecc2e3830>>
puts nc.parse('太郎は花子が読んでいる本を次郎に渡した')
太郎 名詞,固有名詞,人名,名,*,*,太郎,タロウ,タロー
は 助詞,係助詞,*,*,*,*,は,ハ,ワ
* 1 2D 0/1 1.700175
花子 名詞,固有名詞,人名,名,*,*,花子,ハナコ,ハナコ
が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
* 2 3D 0/2 1.825021
読ん 動詞,自立,*,*,五段・マ行,連用タ接続,読む,ヨン,ヨン
で 助詞,接続助詞,*,*,*,*,で,デ,デ
いる 動詞,非自立,*,*,一段,基本形,いる,イル,イル
* 3 5D 0/1 -0.742128
本 名詞,一般,*,*,*,*,本,ホン,ホン
を 助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
* 4 5D 1/2 -0.742128
次 名詞,一般,*,*,*,*,次,ツギ,ツギ
郎 名詞,一般,*,*,*,*,郎,ロウ,ロー
に 助詞,格助詞,一般,*,*,*,に,ニ,ニ
* 5 -1D 0/1 0.000000
渡し 動詞,自立,*,*,五段・サ行,連用形,渡す,ワタシ,ワタシ
た 助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
EOS
=> nil
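The correspondence between hash keys and long-form arguments can be sketched in plain Ruby. This is only an illustration under the assumption that each key maps to a long option by replacing underscores with hyphens (as `:output_format` and `--output-format` suggest); `cabocha_option_string` is a hypothetical helper, not part of nimono, and the library's own conversion may differ.

```ruby
# Hypothetical helper (not nimono's API): builds a CaboCha long-form
# argument string from a Ruby-style options hash.
def cabocha_option_string(options)
  options.map do |key, value|
    # Assumed mapping: :output_format -> --output-format
    "--#{key.to_s.tr('_', '-')}=#{value}"
  end.join(' ')
end

puts cabocha_option_string({ output_format: 1, charset: 'UTF-8' })
# prints: --output-format=1 --charset=UTF-8
```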
# File 'lib/nimono/nimono.rb', line 89
def initialize(options = {})
  @options = self.class.()
  opt_str = self.class.(@options)
  @libpath = self.class.cabocha_library
  @parser = self.class.cabocha_new2(opt_str)
  if @parser.address == 0x0
    raise CabochaError.new("Could not initialize CaboCha with options: '#{opt_str}'")
  end
  @tree = self.class.cabocha_sparse_totree(@parser, "")
  if @options[:output_layer]
    self.class.cabocha_tree_set_output_layer(@tree, @options[:output_layer])
  end
  @sparse_tostr = ->(text) {
    begin
      self.class.cabocha_sparse_tostr(@parser, text).force_encoding(Encoding.default_external)
    rescue
      raise CabochaError.new 'Parse Error'
    end
  }
end
Instance Attribute Details
#chunks ⇒ Array (readonly)
Returns Array of Chunk objects.
# File 'lib/nimono/nimono.rb', line 20
def chunks
  @chunks
end
#libpath ⇒ String (readonly)
Returns absolute file path to CaboCha library.
# File 'lib/nimono/nimono.rb', line 18
def libpath
  @libpath
end
#options ⇒ Hash (readonly)
Returns CaboCha options as Key-Value pairs.
# File 'lib/nimono/nimono.rb', line 16
def options
  @options
end
#tokens ⇒ Array (readonly)
Returns Array of Token objects.
# File 'lib/nimono/nimono.rb', line 23
def tokens
  @tokens
end
Instance Method Details
#parse(text) ⇒ String
Parses the given `text`, returning the CaboCha output as a string. As a side effect, it populates #chunks and #tokens with frozen arrays built from the parse tree.
# File 'lib/nimono/nimono.rb', line 119
def parse(text)
  if text.nil?
    raise CabochaError.new 'Text to parse cannot be nil'
  else
    @result = @sparse_tostr.call(text)
    @tree = self.class.cabocha_sparse_totree(@parser, text)
    @tokens = []
    self.class.cabocha_tree_token_size(@tree).times do |i|
      @tokens << Nimono::Token.new(self.class.cabocha_tree_token(@tree, i))
    end
    @tokens.freeze
    @chunks = []
    self.class.cabocha_tree_chunk_size(@tree).times do |i|
      @chunks << Nimono::Chunk.new(self.class.cabocha_tree_chunk(@tree, i))
      # chunk = Nimono::Chunk.new(self.class.cabocha_tree_chunk(@tree, i))
      # chunk.instance_variable_set(:@tokens, @tokens[chunk.token_pos..(chunk.token_pos + chunk.token_size - 1)])
      # @chunks << chunk
    end
    @chunks.freeze
    self.to_s
  end
end
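The commented-out lines in the listing above hint at how a chunk relates to the token array: it covers `token_size` tokens starting at index `token_pos`. The following sketch shows that slicing with stand-in structs so it runs without CaboCha installed. `Token`, `Chunk`, `surface`, and `tokens_for` are assumptions made for illustration; only `token_pos` and `token_size` appear in the source, and the positions below were counted by hand from the example sentence.

```ruby
# Stand-ins for Nimono::Token and Nimono::Chunk, reduced to the fields
# this sketch needs. `surface` is an assumed attribute name.
Token = Struct.new(:surface)
Chunk = Struct.new(:token_pos, :token_size)

# Returns the tokens covered by a chunk: token_size tokens from token_pos.
def tokens_for(chunk, tokens)
  tokens[chunk.token_pos, chunk.token_size]
end

tokens = %w[太郎 は 花子 が 読ん で いる 本 を 次 郎 に 渡し た].map { |s| Token.new(s) }
chunks = [[0, 2], [2, 2], [4, 3], [7, 2], [9, 3], [12, 2]].map { |pos, size| Chunk.new(pos, size) }

phrases = chunks.map { |chunk| tokens_for(chunk, tokens).map(&:surface).join }
puts phrases.join(' / ')
# prints: 太郎は / 花子が / 読んでいる / 本を / 次郎に / 渡した
```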
#to_s ⇒ String
Returns the result of the most recent call to #parse as a string.
# File 'lib/nimono/nimono.rb', line 147
def to_s
  @result
end
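Because #to_s and #parse return the raw CaboCha output, the lattice format (-f1) can also be post-processed as plain text. The sketch below is not part of nimono. It assumes chunk header lines have the form `* id linkD head/func score`, as in the constructor example, and that each token line separates the surface form from its features with a tab. `parse_lattice`, `ChunkHeader`, and `SAMPLE` are hypothetical names.

```ruby
ChunkHeader = Struct.new(:id, :link, :head, :func, :score)

# Matches a chunk header such as "* 3 5D 0/1 -0.742128".
HEADER_PATTERN = %r{\A\* (\d+) (-?\d+)D (\d+)/(\d+) (-?\d+\.\d+)\z}

# Groups lattice-format output into chunks, each with its parsed header
# and the surface forms of its tokens.
def parse_lattice(output)
  chunks = []
  output.each_line(chomp: true) do |line|
    break if line == 'EOS'
    if (m = HEADER_PATTERN.match(line))
      header = ChunkHeader.new(m[1].to_i, m[2].to_i, m[3].to_i, m[4].to_i, m[5].to_f)
      chunks << { header: header, surfaces: [] }
    elsif chunks.any?
      surface, _feature = line.split("\t", 2) # assumed tab separator
      chunks.last[:surfaces] << surface
    end
  end
  chunks
end

# Excerpt of the example output, with tabs written as \t.
SAMPLE = <<~TEXT
  * 3 5D 0/1 -0.742128
  本\t名詞,一般,*,*,*,*,本,ホン,ホン
  を\t助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
  * 5 -1D 0/1 0.000000
  渡し\t動詞,自立,*,*,五段・サ行,連用形,渡す,ワタシ,ワタシ
  た\t助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
  EOS
TEXT

parse_lattice(SAMPLE).each do |chunk|
  puts "#{chunk[:header].id} -> #{chunk[:header].link}: #{chunk[:surfaces].join}"
end
```

A link value of -1 marks the chunk that depends on no other chunk, which is the final predicate in the example sentence.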