Class: ArDumper

Inherits:
Object
Defined in:
lib/ar_dumper_base.rb

Overview

Formats ActiveRecord data in chunks and dumps it to a file, temporary file, or string. Specify the page_size used to paginate records and flush files.

Specify Output

  • :filename the name of the file to create. By default will create a file based on the timestamp.

  • :file_extension appends file extension unless one exists in file name

  • :only a list of the attributes to be included. By default, all column_names are used.

  • :except a list of the attributes to be excluded. By default, all column_names are used. This option is not available if :only is used

  • :methods a list of the methods to be called on the object

  • :procs hash of header name to Proc object

Attributes (Only and Exclude)

Specify which attributes to include and exclude

Book.dumper :yml, :only => [:author_name, :title]

Book.dump :csv, :except => [:topic_id]
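
The selection rule can be sketched in plain Ruby. This is a minimal illustration modeled on build_attribute_list further down this page, not the plugin's own code; resolve_attributes and the column list are hypothetical names used only for the example.

```ruby
# Hypothetical helper: picks the attribute names to dump.
# :only wins when present; otherwise :except is subtracted from all columns.
def resolve_attributes(column_names, options = {})
  names =
    if options[:only]
      Array(options[:only])
    else
      column_names - Array(options[:except]).map { |e| e.to_s }
    end
  names.map { |n| n.to_s }
end

columns = %w[id title author_name topic_id]   # stand-in for Book.column_names

only_list   = resolve_attributes(columns, :only => [:author_name, :title])
except_list = resolve_attributes(columns, :except => [:topic_id])
```

Here only_list holds just the two named attributes, while except_list holds every column except topic_id.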

Methods

Use :methods to include methods on the record that are not column attributes

BigOle.dumper :csv, :methods => [:age, :favorite_food]
Output..
..other attributes.., 25, doughnuts

Proc

To call procs on the record(s), use :procs with a hash mapping each header name to a Proc. Each proc receives the dumper options hash, which holds the current record in options[:record].

Proc Options Hash

  • :record - the active record

  • :result_set - the current result set

  • :counter - the number of the record

  • :page_num - the page number

  • :target - the file/string target

topic_content_proc = Proc.new{|options|  options[:record].topic ? options[:record].topic.content : 'NO CONTENT' }
Book.dumper :xml, :procs => {:topic_content => topic_content_proc}

 <book>
   # ... other attributes and methods ...
   <topic-content>NO CONTENT</topic-content>
 </book>
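
How a proc sees the options hash can be tried without a database. The sketch below assumes Struct objects as stand-ins for ActiveRecord models (book_class and topic_class are hypothetical) and updates :record and :counter by hand, which the dumper is described as doing before each call.

```ruby
# Stand-ins for ActiveRecord objects (hypothetical, for illustration only).
topic_class = Struct.new(:content)
book_class  = Struct.new(:title, :topic)

topic_content_proc = Proc.new do |options|
  record = options[:record]
  record.topic ? record.topic.content : 'NO CONTENT'
end

# Keys mirror the Proc Options Hash; the real dumper fills them in itself.
options = { :record => nil, :counter => -1, :page_num => 0 }

books = [
  book_class.new('Dune', topic_class.new('Sci-fi')),
  book_class.new('Untitled', nil)
]

values = books.map do |book|
  options[:record] = book      # current record
  options[:counter] += 1       # running record number
  topic_content_proc.call(options)
end
```

The second book has no topic, so its value falls back to 'NO CONTENT'.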

Finder Methods

  • :find a map of the finder options passed to find. For example, {:conditions => ['hairy = ?', 'of course'], :include => :rodents}

  • :records - the records to be dumped instead of using a find

Format Header

  • :header controls the header row:
      when a hash is specified, maps the field name to the header name. For example {:a => 'COL A', :b => 'COL B'} would print ‘COL A’, ‘COL B’
      when an array is specified, uses it instead of the fields
      when true or by default, prints the fields
      when false, does not include a header

  • :text_format a string method such as :titleize, :dasherize, :underscore used to format all the headers. If an attribute is :email_address and :titleize is chosen, then the header value is “Email Address”

  • :root In xml, this is the name of the highest level list object. The plural of the class name is the default. For yml, this is the base name of the objects. Each record will be root_id, for example contact_2348
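
The :header and :text_format rules can be sketched as one small function. This is an illustration under stated assumptions, not the plugin's implementation: header_names is a hypothetical helper, and :upcase stands in for :titleize, which needs ActiveSupport and is not available in plain Ruby.

```ruby
# Hypothetical helper mirroring the :header rules listed above.
def header_names(columns, header = true, text_format = nil)
  return nil if header == false             # no header row at all
  names =
    case header
    when Hash                               # map field name => header name
      lookup = {}
      header.each { |k, v| lookup[k.to_s] = v }
      columns.map { |c| (lookup[c.to_s] || c).to_s }
    when Array                              # explicit names, padded with fields
      header + columns.drop(header.length)
    else                                    # true or unspecified: field names
      columns.map { |c| c.to_s }
    end
  text_format ? names.map { |n| n.to_s.send(text_format) } : names
end

columns = %w[a b email_address]

by_hash   = header_names(columns, { :a => 'COL A', :b => 'COL B' })
by_array  = header_names(columns, ['First'])
formatted = header_names(columns, true, :upcase)
no_header = header_names(columns, false)
```

Fields missing from the hash or array keep their own names, so by_hash ends with "email_address" and by_array with "b" and "email_address".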

Filename and Target

:target_type The target_type for the data. Defaults to :file.

  • :string prints to string. Do not use with large data sets

  • :tmp_file. Use a temporary file that is destroyed when the process exits

  • :file. Use a standard file

:filename basename of the file. Defaults to a random time-based string for non-temporary files.

:file_extension Extension (suffix) like .csv, .xml. Added only if the basename has no suffix. :file_extension is only available when :target_type => :file.

:file_path path or directory of the file. Defaults to dumper_file_path or temporary directories.
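
The way :filename, :file_extension and :file_path combine can be sketched as follows. This is a minimal sketch, assuming a hypothetical resolve_filename helper and example paths; it uses File.extname for the suffix check rather than the regular expression in the plugin source.

```ruby
# Hypothetical helper: builds the output path for :target_type => :file.
def resolve_filename(filename, extension = nil, file_path = nil)
  # append the extension only when the name has no suffix yet
  filename += ".#{extension}" if extension && File.extname(filename).empty?
  # prepend the directory only when the name carries no path of its own
  if File.basename(filename) == filename && file_path && !file_path.empty?
    filename = File.join(file_path, filename)
  end
  filename
end

plain    = resolve_filename('books', 'csv', '/tmp/dumps')
suffixed = resolve_filename('books.txt', 'csv', '/tmp/dumps')
absolute = resolve_filename('/var/data/books', 'xml', '/tmp/dumps')
```

A name that already has a suffix keeps it, and a name that already contains a directory is not moved under :file_path.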

Format specific options

  • :csv - any options to pass to csv parser. Example :csv => { :col_sep => "\t" }

  • :xml - any options to pass to xml parser. Example :xml => { :indent => 4 }

Installation

script/plugin install git://github.com/blythedunham/ar_dumper.git

Developers

Blythe Dunham http://snowgiraffe.com

Constructor Details

#initialize(klass, dump_options = {}) ⇒ ArDumper

:nodoc:



# File 'lib/ar_dumper_base.rb', line 104

def initialize(klass, dump_options={})#:nodoc:

  @klass = klass
  @options = dump_options
  build_attribute_list
  
  unless options[:text_format].nil? || String.new.respond_to?(options[:text_format])
    raise ArDumperException.new("Invalid value for option :text_format #{options[:text_format]}")
  end
end

Instance Attribute Details

#fieldsObject (readonly)

Returns the value of attribute fields.



# File 'lib/ar_dumper_base.rb', line 100

def fields
  @fields
end

#klassObject (readonly)

Returns the value of attribute klass.



# File 'lib/ar_dumper_base.rb', line 101

def klass
  @klass
end

#optionsObject (readonly)

Returns the value of attribute options.



# File 'lib/ar_dumper_base.rb', line 102

def options
  @options
end

Class Method Details

.compute_page_size(max_records, page_num, page_size) ⇒ Object

:nodoc:



# File 'lib/ar_dumper_base.rb', line 480

def self.compute_page_size(max_records, page_num, page_size)#:nodoc:

  max_records ? [(max_records - (page_num * page_size)), page_size].min : page_size
end
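
A worked example of this arithmetic, with the method copied into a standalone script so it runs outside the plugin. The numbers (a :limit of 120 and a page size of 50) are made up for illustration.

```ruby
# Same arithmetic as compute_page_size, copied here so the example runs alone.
def compute_page_size(max_records, page_num, page_size)
  max_records ? [max_records - (page_num * page_size), page_size].min : page_size
end

# :limit => 120 with a page size of 50 gives pages of 50, 50 and 20 records;
# the fourth value is negative, which is what stops the pagination loop.
sizes = (0..3).map { |page_num| compute_page_size(120, page_num, 50) }

# Without an overall limit every page simply uses the page size.
unlimited = compute_page_size(nil, 7, 50)
```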

.paginate_dump_records(klass, options = {}, &block) ⇒ Object

Pagination Helpers (Support before 2.3.2)

Quick and dirty pagination that loops through the records page by page. Options are:

  • :find - a map of the finder options passed to find. For example, {:conditions => ['hairy = ?', 'of course'], :include => :rodents}

  • :page_size - the page size to use. Defaults to dumper_page_size or 50. Set to false to disable pagination

  • :records - the records to be dumped instead of using a find



# File 'lib/ar_dumper_base.rb', line 450

def self.paginate_dump_records(klass, options={}, &block)#:nodoc:

  finder_options = (options[:find]||{}).clone
  
  if options[:records]
    yield options[:records], 0
    return
  #pagination is not needed when :page_size => false

  elsif options[:page_size].is_a?(FalseClass)
    yield klass.find(:all, finder_options), 0
    return
  end
  
  options[:page_size]||= dumper_page_size
  
  #limit becomes the maximum amount of records to pull

  max_records = finder_options[:limit]
  page_num = 0
  finder_options[:limit] = compute_page_size(max_records, page_num, options[:page_size])
  records = []
  while (finder_options[:limit] > 0 && (page_num == 0 || records.length == options[:page_size]))      
    records = klass.find :all, finder_options.update(:offset => page_num * options[:page_size])
    
    yield records, page_num
    page_num = page_num + 1
    
    #calculate the limit if an original limit (max_records) was set

    finder_options[:limit] = compute_page_size(max_records, page_num, options[:page_size])
  end
end
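
The limit/offset loop can be tried against an in-memory array. The sketch below is not the plugin's code: fetch is a hypothetical stand-in for klass.find(:all, :limit => ..., :offset => ...), and each_page is a simplified version of the loop above.

```ruby
# Hypothetical stand-in for a finder call with :limit and :offset.
def fetch(rows, limit, offset)
  rows[offset, limit] || []
end

# Yields [records, page_num] until a short page or the overall limit is hit.
def each_page(rows, page_size, max_records = nil)
  page_num = 0
  loop do
    limit = max_records ? [max_records - page_num * page_size, page_size].min : page_size
    break if limit <= 0
    records = fetch(rows, limit, page_num * page_size)
    yield records, page_num
    break if records.length < page_size   # a short page is the last page
    page_num += 1
  end
end

pages = []
each_page((1..7).to_a, 3) { |records, page_num| pages << [page_num, records] }

limited = []
each_page((1..7).to_a, 3, 5) { |records, _page_num| limited.concat(records) }
```

Seven rows with a page size of three produce pages of three, three and one record; with an overall limit of five, the second page is cut to two records.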

.paginate_each_record(klass, options = {}, &block) ⇒ Object

Quick and dirty pagination that loops through each record, page by page. Options are:

  • :find - a map of the finder options passed to find. For example, {:conditions => ['hairy = ?', 'of course'], :include => :rodents}

  • :page_size - the page size to use. Defaults to dumper_page_size or 50. Set to false to disable pagination



# File 'lib/ar_dumper_base.rb', line 488

def self.paginate_each_record(klass, options={}, &block)#:nodoc:

  counter = -1
  paginate_dump_records(klass, options) do |records, page_num|
    records.each do |record| 
      yield record, (counter +=1)
    end
  end
end

Instance Method Details

#build_attribute_listObject

build a list of attributes, methods and procs



# File 'lib/ar_dumper_base.rb', line 115

def build_attribute_list#:nodoc:

  if options[:only]
    options[:attributes] = Array(options[:only])
  else
    options[:attributes] = @klass.column_names - Array(options[:except]).collect { |e| e.to_s }
  end
    
  options[:attributes] = options[:attributes].collect{|attr| "#{attr}"}  
  options[:methods] = options[:methods].is_a?(Hash) ? options[:methods].values : Array(options[:methods])
  
  #if procs are specified as a hash separate the headers(keys) from the procs(values)

  if options[:procs].is_a?(Hash)
    options[:proc_headers]= options[:procs].keys
    options[:procs]= options[:procs].values
  else
    options[:procs] = Array(options[:procs])
    options[:proc_headers]||= Array.new
    0.upto(options[:procs].size - options[:proc_headers].size - 1) {|idx| options[:proc_headers] << "proc_#{idx}" }
  end
  
end
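
The proc handling at the end of this method can be isolated into a small sketch. split_procs is a hypothetical helper written for this example; it follows the same naming as the code above, where unnamed procs receive headers proc_0, proc_1, and so on.

```ruby
# Hypothetical helper: returns [headers, procs] for a :procs option.
def split_procs(procs, proc_headers = nil)
  if procs.is_a?(Hash)
    # hash form: keys are the header names, values are the procs
    [procs.keys, procs.values]
  else
    list = Array(procs)
    headers = Array(proc_headers).dup
    # generate names for any procs that have no header yet
    (list.size - headers.size).times { |idx| headers << "proc_#{idx}" }
    [headers, list]
  end
end

upper = Proc.new { |options| options[:record].to_s.upcase }
lower = Proc.new { |options| options[:record].to_s.downcase }

named_headers, named_procs     = split_procs({ :shout => upper })
default_headers, default_procs = split_procs([upper, lower])
```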

#build_header_listObject

Returns an array with the header names. This will be in the same order as the data returned by dump_record: attributes + methods + procs.

:header The header defaults to the attribute and method names. When set to false, no header is specified.

 * +hash+ A map from attribute or method name to header column name
 * +array+ A list in the same order that is used to display record data

:procs If a hash, then the keys are the names. If an array, then use proc_0, proc_1, etc.

:text_format Format names with a text format such as :titleize, :dasherize, :underscore



# File 'lib/ar_dumper_base.rb', line 354

def build_header_list#:nodoc:


  header_options = options[:header]
  columns = @options[:attributes] + @options[:methods]
  header_names = 
    if header_options.is_a?(Hash)
      header_options.symbolize_keys!
      
      #Get the header for each attribute and method

      columns.collect{|field|(header_options[field.to_sym]||field).to_s}
      
    #ordered by attributes, methods, then procs

    elsif header_options.is_a?(Array)
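      #copy the array so the caller's :header option is not modified
      names = header_options.dup
      names.concat(columns[names.length..-1]) if names.length < columns.length
      #return the list explicitly; a trailing modifier-if would yield nil
      names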
      
    #default to column names 

    else
      columns
    end
  
  #add proc names

  header_names.concat(options[:proc_headers])
  
  #format names with a text format such as titleize, dasherize, underscore

  header_names.collect!{|n| n.to_s.send(options[:text_format])} if options[:text_format]
  
  header_names
end

#csv_writerObject

Try to use the FasterCSV if it exists otherwise use csv



# File 'lib/ar_dumper_base.rb', line 330

def csv_writer #:nodoc:

  unless @@csv_writer
    @@csv_writer = :faster
    begin 
      require 'faster_csv'#:nodoc:

      ::FasterCSV
    rescue LoadError, NameError
      @@csv_writer = :normal
    end
  end
  @@csv_writer
end

#dump(format) ⇒ Object

Dump to the appropriate format



# File 'lib/ar_dumper_base.rb', line 138

def dump(format)#:nodoc:


  case format.to_sym
    when :csv
      dump_to_csv
      
    when :xml
      dump_to_xml
      
    when :yaml, :fixture, :yml
      dump_to_fixture
      
    else
      raise ArDumperException.new("Unknown format #{format}. Please specify :csv, :xml, or :yml ")
  end
end

#dump_record(record) ⇒ Object

collect the record data into an array



# File 'lib/ar_dumper_base.rb', line 189

def dump_record(record)#:nodoc:

  record_values = @options[:attributes].inject([]){|values, attr| values << record["#{attr}"]; values }
  record_values = @options[:methods].inject(record_values) {|values, method| values << record.send(method); values }
  record_values = @options[:procs].inject(record_values){|values, proc| values << proc.call(options); values }
  record_values
end
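
The order of values in a dumped row (attributes, then methods, then procs) can be seen with a Struct in place of an ActiveRecord object. This is a minimal sketch; book_class and its long? method are hypothetical.

```ruby
# Struct stand-in for a record; Struct#[] accepts string names like record["title"].
book_class = Struct.new(:title, :pages) do
  def long?               # a method that is not a stored attribute
    pages > 300
  end
end

options = {
  :attributes => %w[title pages],
  :methods    => [:long?],
  :procs      => [Proc.new { |opts| opts[:record].title.length }]
}

record = book_class.new('Dune', 412)
options[:record] = record

# attributes first, then methods, then procs (which receive the options hash)
row  = options[:attributes].map { |attr| record[attr] }
row += options[:methods].map { |m| record.send(m) }
row += options[:procs].map { |p| p.call(options) }
```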

#dump_to_csvObject

CSV DUMPER

Dump csv data

  • :csv - any options to pass to csv parser, such as :col_sep (column separator) and :row_sep (row separator). Example :csv => { :col_sep => "\t" }

  • :page_size - the page size to use. Defaults to dumper_page_size or 50



# File 'lib/ar_dumper_base.rb', line 304

def dump_to_csv#:nodoc:

  header = nil
  @options[:csv]||={}
  
  if !@options[:header].is_a?(FalseClass)
    header_list = build_header_list
    #print the header unless set to false

    header = write_csv_row(header_list)
  end
  
  dumper(:csv, header) do |record|
    options[:target] << write_csv_row(dump_record(record))
  end
end
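
The effect of a :csv option such as :col_sep can be previewed with Ruby's standard csv library. This sketch assumes a modern Ruby where CSV.generate_line takes keyword options; the plugin itself targets FasterCSV or the older CSV API, so the call shape here is not the one used in the source above.

```ruby
require 'csv'

# What would be passed to the dumper as :csv => { :col_sep => "\t" }
csv_options = { :col_sep => "\t" }

# Header row first, then one line per record, all with the same options.
header = CSV.generate_line(%w[title author], **csv_options)
row    = CSV.generate_line(%w[Dune Herbert], **csv_options)
output = header + row
```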

#dump_to_fixtureObject

Yaml/Fixture Dumper

Dumps the data to a fixture file. In addition to options listed in dumper:

:root Basename of the record. Defaults to the singular table name, so each record is named like customer_1



# File 'lib/ar_dumper_base.rb', line 279

def dump_to_fixture#:nodoc:

  basename = @options[:root]||@klass.table_name.singularize
  header_list = build_header_list

  # doctor the yaml a bit to print the hash header at the top

  # instead of each record

  dumper(:yml, "---\s") do |record|
    record_data = Hash.new
    dump_record(record).each_with_index{|field, idx| record_data[header_list[idx].to_s] = field.to_s }
    options[:target] << {"#{basename}_#{record.id}" => record_data}.to_yaml.gsub(/^---\s\n/, "\n")
  end
end
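
The trick of writing the YAML document marker once and stripping it from each record can be sketched with the standard yaml library. This is an illustration, not the plugin's code: fixture_entry is a hypothetical helper, and the regular expression is adapted to current Psych output, which emits "---" without the trailing space that the source above expects.

```ruby
require 'yaml'

# Hypothetical helper: one fixture entry without its own "---" document marker.
def fixture_entry(basename, id, data)
  { "#{basename}_#{id}" => data }.to_yaml.sub(/\A---[ \t]*\n/, "")
end

entry = fixture_entry('book', 1, { 'title' => 'Dune' })

# One marker at the top, then every record appended to the same document.
document = "---\n" + entry + fixture_entry('book', 2, { 'title' => 'Emma' })
loaded   = YAML.load(document)
```

Because the marker appears only once, the file loads as a single hash keyed by record name.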

#dump_to_xmlObject

XML Dumper

Dumps the data to an xml file

Using the ActiveRecord version of dumper so we CANNOT specify fields that are not attributes

In addition to options listed in dumper: :xml - xml options for the xml_serializer. Includes :indent, :skip_instruct, :margin

Note that :procs will use the dumper proc and pass the dumper options

Book.dump :xml, :procs => {:topic_content => Proc.new { |options| options[:record].topic.content }}

To use xml proc, specify :xml => {:procs => array_of_procs}

Book.dump :xml, :xml => {:procs => [Proc.new{|xml_options| xml_options[:builder].tag 'abc', 'def'}]}


# File 'lib/ar_dumper_base.rb', line 212

def dump_to_xml#:nodoc:


  #preserve the original skip instruct

  skip_instruct = @options[:xml] && @options[:xml][:skip_instruct].is_a?(TrueClass)

  self.options[:procs]||= []

  #use the fields if :only is not specified in the xml options

  xml_options = {
    :only => @options[:only],
    :except => @options[:except],
    :methods => @options[:methods]
  }
  
  xml_options.update(@options[:xml]) if @options[:xml]

  #do not instruct for each set

  xml_options[:skip_instruct] = true
  xml_options[:indent]||=2
  xml_options[:margin] = xml_options[:margin].to_i + 1
  
  #set the variable on the options

  options[:xml] = xml_options
  
  #builder for header and footer

  builder_options = {
    :margin => xml_options[:margin] - 1,
    :indent => xml_options[:indent]
  }
  
  options[:root] = (options[:root] || @klass.to_s.underscore.pluralize).to_s

  #use the builder to make sure we are indented properly

  builder = Builder::XmlMarkup.new(builder_options.clone)
  builder.instruct! unless skip_instruct
  builder << "<#{options[:root]}>\n"
  header = builder.target!
  
  #get the footer. Using the builder will make sure we are indented properly

  builder = Builder::XmlMarkup.new(builder_options)
  builder << "</#{options[:root]}>"
  footer = builder.target!

  dumper(:xml, header, footer) do |record|
    options[:target] << serialize_record_dump_xml(record, xml_options)
  end
end

#dumper(file_extension = nil, header = nil, footer = nil, &block) ⇒ Object

Wrapper around the dump. The main dump functionality



# File 'lib/ar_dumper_base.rb', line 156

def dumper(file_extension=nil, header = nil, footer = nil, &block)#:nodoc:

  
  options[:counter] = -1
  begin
    #get the file parameters

    target = prepare_target(file_extension)
    target << header if header
    ArDumper.paginate_dump_records(@klass, @options) do |records, page_num|
    
      #save state on options to make it accessible by

      #class and procs

      options[:result_set] = records
      options[:page_num] = page_num
      
      records.each do |record|
        options[:record] = record
        yield record
      end
      
      #flush after each set

      target.flush if target.respond_to?(:flush)
    end
    target << footer if footer
    
  #final step close the options[:target]

  ensure
    target.close if target && target.respond_to?(:close)
  end

  options[:full_file_name]||target
end
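
The shape of this wrapper (header, one flush per page, footer, and a close that always runs) can be reproduced with a StringIO target. The sketch below is a simplified miniature; write_dump and the sample data are hypothetical.

```ruby
require 'stringio'

# Hypothetical miniature of the wrapper: header, pages, footer, guaranteed close.
def write_dump(target, pages, header = nil, footer = nil)
  target << header if header
  pages.each do |records|
    records.each { |record| target << yield(record) }
    target.flush if target.respond_to?(:flush)   # flush after each page
  end
  target << footer if footer
  target
ensure
  # runs even if a record block raises, like the ensure clause above
  target.close if target.respond_to?(:close)
end

io = StringIO.new
write_dump(io, [[1, 2], [3]], "<numbers>\n", "</numbers>\n") { |n| "  <n>#{n}</n>\n" }
result = io.string
```

The buffer is still readable through io.string after the close, which is convenient for checking what a dump would have written.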

#prepare_target(file_extension = nil) ⇒ Object

Create the target (file) based on these options. The target must respond to <<. Current target types are :string, :file, :tmp_file

:target_type The target type for the data. Defaults to :file

* :string prints to string. Do not use with large data sets
* :tmp_file. Use a temporary file that is destroyed when the process exits
* :file. Use a standard file

:filename basename of the file. Defaults to a random time-based string for non-temporary files

:file_extension Extension (suffix) like .csv, .xml. Added only if the basename has no suffix. :file_extension is only available when +:target_type => :file+

:file_path path or directory of the file. Defaults to dumper_file_path or temporary directories



# File 'lib/ar_dumper_base.rb', line 400

def prepare_target(file_extension = nil)#:nodoc:

  
  options[:target] = case options[:target_type]
    #to string option dumps to a string instead of a file

    when :string
      String.new
      
    #use a temporary file

    #open a temporary file with the basename specified by filename

    #defaults to the value of one of the environment variables TMPDIR, TMP, or TEMP

    when :tmp_file

      Tempfile.open(options[:filename]||(@@dumper_tmp_file_basename+@klass.name.downcase), 
                    options[:file_path]||@@dumper_file_path)
                    
    #default to a real file                

    else
      extension = options[:file_extension]||file_extension
      mode = options[:append_to_file].is_a?(TrueClass)? 'a' : 'w'
      filename = options[:filename]||"#{@@dumper_tmp_file_basename}.#{@klass.name.downcase}.#{Time.now.to_f.to_s}.#{extension}"
      
      #append an extension unless one already exists

      filename += ".#{extension}" if extension && filename !~ /\.\w*$/
      
      #get the file path if the filename does not contain one

      if File.basename(filename) == filename
        path = options[:file_path]||@@dumper_file_path 
        filename = File.join(path, filename) unless path.blank?
      end

      
      File.open(filename, mode)
   end
   
  options[:full_file_name] = options[:target].path if options[:target].respond_to?(:path)
  options[:target]
end

#serialize_record_dump_xml(record, xml_options) ⇒ Object

Serialize the xml data for the given record



# File 'lib/ar_dumper_base.rb', line 261

def serialize_record_dump_xml(record, xml_options)#:nodoc:


  serializer = ActiveRecord::XmlSerializer.new(record, xml_options.dup)
  xml = serializer.to_s do |builder|
    self.options[:procs].each_with_index do |proc, idx|
      serializer.add_tag_for_value(self.options[:proc_headers][idx].to_s,
                                   proc.call(self.options))
    end
  end
end

#write_csv_row(row_data, header_list = []) ⇒ Object

Write out the csv row using the selected csv writer



# File 'lib/ar_dumper_base.rb', line 320

def write_csv_row(row_data, header_list=[])#:nodoc: 

  if csv_writer == :faster
    ::FasterCSV::Row.new(header_list, row_data).to_csv(@options[:csv])
  else
    ::CSV.generate_line(row_data, @options[:csv][:col_sep], @options[:csv][:row_sep]) + (@options[:csv][:row_sep]||"\n")
  end
end