Class: RStore::CSV
Instance Attribute Summary
- #data_array ⇒ Array<Data> (readonly)
  Holds RStore::Data objects that are used internally to store information from a data source.
- #database ⇒ BaseDB (readonly)
  A subclass of BaseDB.
- #table ⇒ BaseTable (readonly)
  A subclass of BaseTable.
Class Method Summary
- .change_default_options(options) ⇒ void
  Change the default options recognized by #from. The new option values apply to all subsequent instances of RStore::CSV. Options can be reset to their defaults by calling CSV.reset_default_options. See #from for a list of all options and their default values.
- .database_table(db_table) ⇒ Object
- .delimiter_correct?(name) ⇒ Boolean
- .query(db_table) {|table| ... } ⇒ void
  Easy querying by yielding a Sequel::Dataset instance of your table.
- .reset_default_options ⇒ void
  Reset the options recognized by #from to their default values.
Instance Method Summary
- #create_table(db) ⇒ Object
- #from(source, options = {}) ⇒ void
  Specify the source of the csv file(s). This method can be called several times on a given instance of RStore::CSV.
- #initialize(&block) ⇒ CSV (constructor)
  This constructor takes a block that is evaluated in the context of the new instance, so self inside the block is the RStore::CSV object.
- #ran_once? ⇒ Boolean
  Test if the data has been inserted into the database table.
- #read_data(data_object) ⇒ Object
- #run ⇒ void
  Start processing the csv files, storing the data into a database table.
- #to(db_table) ⇒ void
  Choose the database table to store the csv data into.
Constructor Details
#initialize(&block) ⇒ CSV
    # File 'lib/rstore/csv.rb', line 37

    def initialize &block
      @data_hash  = {}
      @data_array = []
      @database   = nil
      @table      = nil

      # Tracking method calls to #from, #to, and #run.
      @from = false
      @to   = false
      @run  = false

      instance_eval(&block) if block_given?
    end
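The block-based constructor can be illustrated with a minimal sketch. `MiniImporter` and its methods are hypothetical stand-ins and not part of RStore; the only thing taken from the source above is the `instance_eval(&block) if block_given?` call, which evaluates the block with `self` set to the new instance. The path and table name are illustrative.

```ruby
# Hypothetical stand-in class; only the instance_eval pattern mirrors the source.
class MiniImporter
  attr_reader :calls

  def initialize(&block)
    @calls = []
    # The block runs with self set to this instance, so bare method
    # calls inside it are dispatched to the new object.
    instance_eval(&block) if block_given?
  end

  def from(source)
    @calls << [:from, source]
  end

  def to(db_table)
    @calls << [:to, db_table]
  end
end

importer = MiniImporter.new do
  from 'data/sample.csv'   # illustrative path
  to 'company.products'    # illustrative database.table name
end

p importer.calls
```

Because the block is evaluated with `instance_eval`, instance variables and `self` from the calling scope are not visible inside it; local variables still are.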
Instance Attribute Details
#data_array ⇒ Array<Data> (readonly)
Holds RStore::Data objects that are used internally to store information from a data source.

    # File 'lib/rstore/csv.rb', line 20

    def data_array
      @data_array
    end
Class Method Details
.change_default_options(options) ⇒ void
This method returns an undefined value.
Change the default options recognized by #from. The new option values apply to all subsequent instances of RStore::CSV. Options can be reset to their defaults by calling reset_default_options. See #from for a list of all options and their default values.

    # File 'lib/rstore/csv.rb', line 275

    def self.change_default_options options
      Configuration.()
    end
.database_table(db_table) ⇒ Object
    # File 'lib/rstore/csv.rb', line 114

    def self.database_table db_table
      raise ArgumentError, "The name of the database and table have to be separated with a dot (.)" unless delimiter_correct?(db_table)

      db, tb = db_table.split('.')

      database = BaseDB.db_classes[db.downcase.to_sym]
      table    = BaseTable.table_classes[tb.downcase.to_sym]

      raise Exception, "Database '#{db}' not found" if database.nil?
      raise Exception, "Table '#{tb}' not found" if table.nil?

      [database, table]
    end
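The lookup above can be sketched in isolation. The two constant hashes below are hypothetical stand-ins for `BaseDB.db_classes` and `BaseTable.table_classes`, and their contents are made up. Unlike the source, which raises `Exception`, this sketch raises a plain `RuntimeError` for the "not found" cases.

```ruby
# Hypothetical registries standing in for BaseDB.db_classes and
# BaseTable.table_classes; the keys and values are made up.
DB_CLASSES    = { company: :CompanyDB }
TABLE_CLASSES = { products: :ProductsTable }

def lookup(db_table)
  # Same pattern as delimiter_correct?: one dot with text on both sides.
  unless db_table =~ /^[^\.]+\.[^\.]+$/
    raise ArgumentError, "The name of the database and table have to be separated with a dot (.)"
  end

  db, tb = db_table.split('.')

  # Names are downcased before the lookup, so the match is case-insensitive.
  database = DB_CLASSES[db.downcase.to_sym]
  table    = TABLE_CLASSES[tb.downcase.to_sym]

  raise "Database '#{db}' not found" if database.nil?
  raise "Table '#{tb}' not found" if table.nil?

  [database, table]
end

p lookup('Company.Products')
```

The downcasing step means that `'Company.Products'` and `'company.products'` resolve to the same pair.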
.delimiter_correct?(name) ⇒ Boolean
    # File 'lib/rstore/csv.rb', line 255

    def self.delimiter_correct? name
      !!(name =~ /^[^\.]+\.[^\.]+$/)
    end
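To see which names the pattern accepts, the same regular expression can be tried on a few sample strings. The lambda and the sample names below are illustrative only.

```ruby
# Same pattern as in the source above, wrapped in a standalone lambda.
delimiter_correct = ->(name) { !!(name =~ /^[^\.]+\.[^\.]+$/) }

samples = %w[company.products company company.products.archive .products company.]

samples.each do |name|
  puts "#{name.inspect}: #{delimiter_correct.call(name)}"
end
```

Of these samples, only `company.products` passes: the pattern requires exactly one dot with at least one character on each side.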
.query(db_table) {|table| ... } ⇒ void
This method returns an undefined value.
Easy querying by yielding a Sequel::Dataset instance of your table.
    # File 'lib/rstore/csv.rb', line 245

    def self.query db_table, &block
      database, table = database_table(db_table)
      database.connect do |db|
        block.call(db[table.name]) if block_given?  # Sequel::Dataset
      end
    end
.reset_default_options ⇒ void
This method returns an undefined value.
Reset the options recognized by #from to their default values.
    # File 'lib/rstore/csv.rb', line 285

    def self.reset_default_options
      Configuration.
    end
Instance Method Details
#create_table(db) ⇒ Object
    # File 'lib/rstore/csv.rb', line 214

    def create_table db
      name = @table.name

      if @database.connection_info.is_a?(Hash)
        if @database.connection_info[:adapter] == 'mysql'
          # http://sequel.rubyforge.org/rdoc/files/doc/release_notes/2_10_0_txt.html
          Sequel::MySQL.default_engine = 'InnoDB'
          # http://stackoverflow.com/questions/1671401/unable-to-output-mysql-tables-which-involve-dates-in-sequel
          Sequel::MySQL.convert_invalid_date_time = nil
        end
      end

      unless db.table_exists?(name)
        db.create_table(name, &@table.table_info)
      end
    end
#from(source, options) ⇒ void
#from(source) ⇒ void
This method returns an undefined value.
Specify the source of the csv file(s). This method can be called several times on a given instance of RStore::CSV. It has to be called before #run.

    # File 'lib/rstore/csv.rb', line 89

    def from source, options={}
      crawler = FileCrawler.new(source, :csv, options)
      @data_hash.merge!(crawler.data_hash)
      @from = true
    end
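Since each call to #from merges the crawler's result into the same hash with `merge!`, repeated calls accumulate sources. The sketch below shows only that accumulation; the two literal hashes are hypothetical stand-ins for what `FileCrawler#data_hash` returns, and their keys and values are made up.

```ruby
# Hypothetical stand-ins for the hashes returned by FileCrawler#data_hash.
data_hash = {}

first_call  = { 'data/a.csv' => { label: 'first' } }
second_call = { 'data/b.csv' => { label: 'second' },
                'data/a.csv' => { label: 'second' } }

data_hash.merge!(first_call)
# Entries with a key that is already present overwrite the earlier value.
data_hash.merge!(second_call)

p data_hash.keys
p data_hash['data/a.csv']
```

One consequence of using `merge!` is that a source registered twice appears only once, with the entry from the later call.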
#ran_once? ⇒ Boolean
Test if the data has been inserted into the database table.
    # File 'lib/rstore/csv.rb', line 261

    def ran_once?
      @run == true
    end
#read_data(data_object) ⇒ Object
    # File 'lib/rstore/csv.rb', line 183

    def read_data data_object
      path = data_object.path
      = data_object.

      begin
        if path.url?
          require 'nokogiri'
          doc = Nokogiri::HTML(open(path))

          selector = [:file_options][:selector]

          content = doc.css(selector).inject("") do |result, link|
            result << link.content << "\n"
            result
          end
        else
          content = File.read(path)
        end

        raise ArgumentError, "Empty content!" if content.empty?
      rescue Exception => e
        logger = Logger.new(data_object)
        logger.log(:fetch, e)
        logger.error
      end

      content
    end
#run ⇒ void
    # File 'lib/rstore/csv.rb', line 132

    def run
      return if ran_once?   # Ignore subsequent calls to #run
      raise Exception, "At least one method 'from' has to be called before method 'run'" unless @from == true
      raise Exception, "Method 'to' has to be called before method 'run'" unless @to == true

      @data_hash.each do |path, data|
        content = read_data(data)
        @data_array << Data.new(path, content, :raw, data.)
      end

      @database.connect do |db|
        create_table(db)
        name = @table.name

        prepared_data_array = @data_array.map do |data|
          data.parse_csv.convert_fields(db, name)
        end

        insert_all(prepared_data_array, db, name)

        @run = true
        = <<-TEXT.gsub(/^\s+/, '')
          ===============================
          All data has been successfully inserted into table '#{database.name}.#{table.name}'"
          -------------------------------
          You can retrieve all table data with the following code:
          -------------------------------
          #{self.class}.query('#{database.name}.#{table.name}') do |table|
          table.all
          end
          ===============================
        TEXT
        puts
      end
    end
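The call-order rules enforced by #run (at least one #from, one #to, and only one effective run) can be sketched with a small stand-in class. `GuardedRunner` is hypothetical; only the three flags and the guard clauses mirror the source, and the sketch raises `RuntimeError` where the source raises `Exception`.

```ruby
# Hypothetical stand-in; only the flag checks mirror the source.
class GuardedRunner
  attr_reader :run_count

  def initialize
    @from = @to = @run = false
    @run_count = 0
  end

  def from(_source)
    @from = true
  end

  def to(_db_table)
    @to = true
  end

  def ran_once?
    @run == true
  end

  def run
    return if ran_once?  # subsequent calls are ignored
    raise "At least one method 'from' has to be called before method 'run'" unless @from
    raise "Method 'to' has to be called before method 'run'" unless @to

    @run_count += 1      # placeholder for the actual import work
    @run = true
  end
end

runner = GuardedRunner.new

begin
  runner.run             # neither from nor to has been called yet
rescue RuntimeError => e
  puts e.message
end

runner.from('data/sample.csv')
runner.to('company.products')
runner.run
runner.run               # ignored, because ran_once? is now true
p runner.run_count
```

Returning early on repeated calls makes #run idempotent, so the import work happens at most once per instance.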
#to(db_table) ⇒ void
This method returns an undefined value.
Choose the database table to store the csv data into. This method has to be called before #run.
    # File 'lib/rstore/csv.rb', line 107

    def to db_table
      @database, @table = CSV.database_table(db_table)
      @to = true
    end