Class: SimpleXlsxReader::Loader::SharedStringsParser

Inherits:
Nokogiri::XML::SAX::Document
  • Object
Defined in:
lib/simple_xlsx_reader/loader/shared_strings_parser.rb

Overview

For performance reasons, Excel uses an optional SpreadsheetML feature that stores all strings in a separate XML file and references them from cells by their index in that file. This parser reads that file and collects the strings into an array, in document order.

msdn.microsoft.com/en-us/library/office/gg278314.aspx
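As a rough illustration of the indexing scheme described above, the sketch below resolves cell values against a shared strings table. The method name `resolve_cells` and the sample data are hypothetical and not part of SimpleXlsxReader; the only SpreadsheetML detail assumed is that a cell of type "s" holds an index into the shared strings table rather than the text itself.

```ruby
# Hypothetical helper, not part of SimpleXlsxReader.
# cells:          array of [type, raw_value] pairs as they might be read from a sheet
# shared_strings: array of strings in the order they appear in the shared strings file
def resolve_cells(cells, shared_strings)
  cells.map do |type, raw|
    # Type "s" marks a shared string; the raw value is a zero-based index.
    type == "s" ? shared_strings[Integer(raw)] : raw
  end
end

# Made-up sample data for illustration only.
SAMPLE_SHARED_STRINGS = ["Name", "City", "Alice", "Paris"].freeze
SAMPLE_CELLS = [["s", "0"], ["s", "1"], ["s", "2"], ["s", "3"], ["n", "42"]].freeze

puts resolve_cells(SAMPLE_CELLS, SAMPLE_SHARED_STRINGS).inspect
```

A string that appears in many cells is stored once and referenced by number everywhere else, which is where the size and speed benefit comes from.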

Constructor Details

#initialize ⇒ SharedStringsParser

Returns a new instance of SharedStringsParser.



# File 'lib/simple_xlsx_reader/loader/shared_strings_parser.rb', line 17

def initialize
  @result = []
  @composite = false
  @extract = false
end

Instance Attribute Details

#result ⇒ Object (readonly)

Returns the value of attribute result.



# File 'lib/simple_xlsx_reader/loader/shared_strings_parser.rb', line 23

def result
  @result
end

Class Method Details

.parse(file) ⇒ Object



# File 'lib/simple_xlsx_reader/loader/shared_strings_parser.rb', line 11

def self.parse(file)
  new.tap do |parser|
    Nokogiri::XML::SAX::Parser.new(parser).parse(file)
  end.result
end

Instance Method Details

#characters(string) ⇒ Object



# File 'lib/simple_xlsx_reader/loader/shared_strings_parser.rb', line 32

def characters(string)
  return unless @extract

  @current_string << string
end

#end_element(name) ⇒ Object



# File 'lib/simple_xlsx_reader/loader/shared_strings_parser.rb', line 38

def end_element(name)
  case name
  when 't' then @extract = false
  when 'si' then @result << @current_string
  end
end
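To see how #end_element cooperates with #start_element and #characters, the sketch below replays a hand-written sequence of SAX events through a stand-in class. `SharedStringsSketch`, `replay_events`, and `SAMPLE_EVENTS` are hypothetical names introduced here; the stand-in copies the three callbacks from this page but drops the Nokogiri superclass so it runs without any gems. It is an illustration of the event flow, not the library's own test or API.

```ruby
# Hypothetical stand-in mirroring the callbacks documented on this page.
class SharedStringsSketch
  attr_reader :result

  def initialize
    @result = []
    @extract = false
  end

  def start_element(name, _attrs = [])
    case name
    when 'si' then @current_string = +"" # start a new entry
    when 't' then @extract = true        # text inside <t> is wanted
    end
  end

  def characters(string)
    return unless @extract

    @current_string << string
  end

  def end_element(name)
    case name
    when 't' then @extract = false
    when 'si' then @result << @current_string # entry complete
    end
  end
end

# Events roughly as a SAX parser would emit them for:
#   <si><t>Plain</t></si>
#   <si><r><t>Rich </t></r><r><t>text</t></r></si>
SAMPLE_EVENTS = [
  [:start_element, 'si'], [:start_element, 't'], [:characters, 'Plain'],
  [:end_element, 't'], [:characters, "\n  "], [:end_element, 'si'],
  [:start_element, 'si'],
  [:start_element, 'r'], [:start_element, 't'], [:characters, 'Rich '],
  [:end_element, 't'], [:end_element, 'r'],
  [:start_element, 'r'], [:start_element, 't'], [:characters, 'text'],
  [:end_element, 't'], [:end_element, 'r'],
  [:end_element, 'si']
].freeze

# Feed each event to a fresh sketch parser and return the collected strings.
def replay_events(events)
  parser = SharedStringsSketch.new
  events.each { |callback, arg| parser.public_send(callback, arg) }
  parser.result
end

puts replay_events(SAMPLE_EVENTS).inspect
```

Because the buffer is reset only on `si` and appended to on every `t`, an entry made of several text runs ends up as one concatenated string, while character data outside `t` (such as the whitespace event above) is ignored.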

#start_element(name, _attrs = []) ⇒ Object



# File 'lib/simple_xlsx_reader/loader/shared_strings_parser.rb', line 25

def start_element(name, _attrs = [])
  case name
  when 'si' then @current_string = +"" # UTF-8 variant of String.new
  when 't' then @extract = true
  end
end
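The comment on `+""` in #start_element can be checked with a few lines of plain Ruby. The sketch below compares the two ways of creating an empty, mutable buffer; `fresh_buffers` is a hypothetical helper name, and the snippet assumes the file uses Ruby's default UTF-8 source encoding.

```ruby
# Hypothetical helper returning one empty buffer built each way.
def fresh_buffers
  {
    string_new: String.new, # no argument: binary (ASCII-8BIT) encoding
    unary_plus: +""         # unfrozen copy of a literal: source encoding
  }
end

fresh_buffers.each do |label, buffer|
  puts "#{label}: encoding=#{buffer.encoding} frozen=#{buffer.frozen?}"
end
```

Both buffers are mutable, so either can be appended to with `<<`; the difference is only the encoding the empty string starts with.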