Class: ArDumper

Inherits: Object

Defined in: lib/ar_dumper_base.rb

Overview

Formats ActiveRecord data in chunks and dumps it to a file, temporary file, or string. Specify the page_size used to paginate records and flush files.

Specify Output

:filename the name of the file to create. By default will create a file based on the timestamp.
:file_extension appends file extension unless one exists in file name
:only a list of the attributes to be included. By default, all column_names are used.
:except a list of the attributes to be excluded. By default, all column_names are used. This option is not available if :only is used
:methods a list of the methods to be called on the object
:procs hash of header name to Proc object

Attributes (Only and Exclude)

Specify which attributes to include and exclude

Book.dumper :yml, :only => [:author_name, :title]

Book.dump :csv, :except => [:topic_id]
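
The selection rule can be sketched in plain Ruby. This is a minimal sketch that follows the logic of build_attribute_list listed further down; select_attributes and the column list are hypothetical stand-ins for a real model and its column_names.

```ruby
# Hypothetical helper: mirrors how :only and :except choose attributes.
# column_names stands in for something like Book.column_names.
def select_attributes(column_names, options = {})
  if options[:only]
    # :only wins; :except is ignored when :only is present
    Array(options[:only]).map(&:to_s)
  else
    column_names - Array(options[:except]).map(&:to_s)
  end
end

columns     = %w[id title author_name topic_id]
only_list   = select_attributes(columns, { :only => [:author_name, :title] })
except_list = select_attributes(columns, { :except => [:topic_id] })

p only_list
p except_list
```

Attribute names are normalized to strings, so symbols and strings can be mixed in either option.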

Methods

Use :methods to include methods on the record that are not column attributes

BigOle.dumper :csv, :methods => [:age, :favorite_food]
Output..
..other attributes.., 25, doughnuts

Proc

To call procs on the object(s), use :procs with a hash of header name to Proc. The dumper options hash is passed to each proc, and the current record is available as options[:record].

Proc Options Hash

:record - the active record
:result_set - the current result set
:counter - the number of the record
:page_num - the page number
:target - the file/string target

topic_content_proc = Proc.new{|options|  options[:record].topic ? options[:record].topic.content : 'NO CONTENT' }
Book.dumper :procs => {:topic_content => topic_content_proc}

 <book>
   # ... other attributes and methods ...
   <topic-content>NO CONTENT</topic-content>
 </book>
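
How a proc sees the current record can be tried without a database. The sketch below is only an illustration: SketchTopic and SketchBook are hypothetical Structs standing in for ActiveRecord models, and the loop imitates the dumper by storing each record in the options hash before calling the proc.

```ruby
# Hypothetical stand-ins for ActiveRecord models (assumption for illustration).
SketchTopic = Struct.new(:content)
SketchBook  = Struct.new(:title, :topic)

topic_content_proc = Proc.new do |options|
  options[:record].topic ? options[:record].topic.content : 'NO CONTENT'
end

books = [
  SketchBook.new('Dune', SketchTopic.new('Sci-fi')),
  SketchBook.new('Untitled', nil)
]

# One shared options hash; :record is replaced for every record.
options = {}
values = books.map do |book|
  options[:record] = book
  topic_content_proc.call(options)
end

p values
```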

Finder Methods

:find a map of the finder options passed to find. For example, {:conditions => ['hairy = ?', 'of course'], :include => :rodents}
:records - the records to be dumped instead of using a find

Format Header

:header controls the header row:

when a hash is specified, maps the field name to the header name. For example, {:a => 'COL A', :b => 'COL B'} would print 'COL A', 'COL B'
when an array is specified, uses it instead of the field names
when true (the default), prints the field names
when false, does not include a header

:text_format a String method such as :titleize, :dasherize, or :underscore used to format all the headers. If an attribute is :email_address and :titleize is chosen, then the header value is "Email Address"
:root In XML, this is the name of the highest-level list object. The plural of the class name is the default. For yml, this is the base name of the objects. Each record will be named root_id. For example, contact_2348
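
The header rules can be sketched as a small function. This is a minimal sketch, not the plugin's code: resolve_headers is a hypothetical name, and :upcase is used as the text format because :titleize comes from ActiveSupport and is not available in plain Ruby.

```ruby
# Hypothetical helper that follows the :header and :text_format rules.
def resolve_headers(fields, header = true, text_format = nil)
  return nil if header == false # no header row at all

  names =
    case header
    when Hash
      # map field name to header name, falling back to the field name
      fields.map { |field| (header[field.to_sym] || field).to_s }
    when Array
      # use the array, then fill any missing trailing columns with field names
      header.map(&:to_s) + fields.drop(header.length).map(&:to_s)
    else
      fields.map(&:to_s)
    end

  text_format ? names.map { |name| name.send(text_format) } : names
end

p resolve_headers([:a, :b], { :a => 'COL A', :b => 'COL B' })
p resolve_headers([:a, :b, :c], ['First'])
p resolve_headers([:email_address], true, :upcase)
```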

Filename and Target

:target_type The target type for the data. Defaults to :file.

:string prints to a string. Do not use with large data sets
:tmp_file uses a temporary file that is destroyed when the process exits
:file uses a standard file

:filename basename of the file. Defaults to a random time-based string for non-temporary files
:file_extension extension (suffix) such as .csv or .xml. Added only if the basename has no suffix. Only available when :target_type => :file
:file_path path or directory of the file. Defaults to dumper_file_path or the temporary directories

Format specific options

:csv - any options to pass to csv parser. Example :csv => { :col_sep => "\t" }
:xml - any options to pass to xml parser. Example :xml => { :indent => 4 }
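
To see what a :col_sep option does to a row, the standard library can be used directly. This sketch assumes the keyword-argument form of CSV.generate_line found in current Ruby versions, which differs from the older positional call in the write_csv_row listing further down; the row data is made up.

```ruby
require 'csv'

# The same hash that would be passed as :csv => { :col_sep => "\t" }
csv_options = { :col_sep => "\t" }
row = ['Dune', 'Frank Herbert', 1965]

# Fields are joined with a tab instead of a comma.
line = CSV.generate_line(row, **csv_options)
print line
```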

Installation

script/plugin install git://github.com/blythedunham/ar_dumper.git

Developers

Blythe Dunham http://snowgiraffe.com

Homepage

Project Site: github.com/blythedunham/ar_dumper/tree/master
Rdoc: snowgiraffe.com/rdocs/ar_dumper

Instance Attribute Summary

#fields ⇒ Object readonly

Returns the value of attribute fields.
#klass ⇒ Object readonly

Returns the value of attribute klass.
#options ⇒ Object readonly

Returns the value of attribute options.

Class Method Summary

.compute_page_size(max_records, page_num, page_size) ⇒ Object

:nodoc:.
.paginate_dump_records(klass, options = {}, &block) ⇒ Object

Pagination Helpers (Support before 2.3.2).
.paginate_each_record(klass, options = {}, &block) ⇒ Object

Quick and dirty paginate to loop through each record, page by page. Options include :find, a map of the finder options passed to find.

Instance Method Summary

#build_attribute_list ⇒ Object

build a list of attributes, methods and procs.
#build_header_list ⇒ Object

Returns an array with the header names, in the same order as the data returned by dump_record: attributes + methods + procs.
#csv_writer ⇒ Object

Try to use the FasterCSV if it exists otherwise use csv.
#dump(format) ⇒ Object

Dump to the appropriate format.
#dump_record(record) ⇒ Object

collect the record data into an array.
#dump_to_csv ⇒ Object

CSV DUMPER.
#dump_to_fixture ⇒ Object

Yaml/Fixture Dumper.
#dump_to_xml ⇒ Object

XML Dumper.
#dumper(file_extension = nil, header = nil, footer = nil, &block) ⇒ Object

Wrapper around the dump.
#initialize(klass, dump_options = {}) ⇒ ArDumper constructor

:nodoc:.
#prepare_target(file_extension = nil) ⇒ Object

Create the target (file) based on these options.
#serialize_record_dump_xml(record, xml_options) ⇒ Object

Serialize the xml data for the given record.
#write_csv_row(row_data, header_list = []) ⇒ Object

Write out the csv row using the selected csv writer.

Constructor Details

#initialize(klass, dump_options = {}) ⇒ `ArDumper`

:nodoc:

# File 'lib/ar_dumper_base.rb', line 104

def initialize(klass, dump_options={})#:nodoc:

  @klass = klass
  @options = dump_options
  build_attribute_list
  
  unless options[:text_format].nil? || String.new.respond_to?(options[:text_format])
    raise ArDumperException.new("Invalid value for option :text_format #{options[:text_format]}")
  end
end

Instance Attribute Details

#fields ⇒ `Object` (readonly)

Returns the value of attribute fields.



# File 'lib/ar_dumper_base.rb', line 100

def fields
  @fields
end

#klass ⇒ `Object` (readonly)

Returns the value of attribute klass.



# File 'lib/ar_dumper_base.rb', line 101

def klass
  @klass
end

#options ⇒ `Object` (readonly)

Returns the value of attribute options.



# File 'lib/ar_dumper_base.rb', line 102

def options
  @options
end

Class Method Details

.compute_page_size(max_records, page_num, page_size) ⇒ `Object`

:nodoc:



# File 'lib/ar_dumper_base.rb', line 480

def self.compute_page_size(max_records, page_num, page_size)#:nodoc:

  max_records ? [(max_records - (page_num * page_size)), page_size].min : page_size
end
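
A worked example may make the arithmetic clearer. The method body below is copied from the listing above as a top-level function; the numbers (a limit of 120 records and a page size of 50) are made up for illustration.

```ruby
# Same arithmetic as compute_page_size, defined at top level for the example.
def compute_page_size(max_records, page_num, page_size)
  max_records ? [(max_records - (page_num * page_size)), page_size].min : page_size
end

# With an overall limit of 120 and pages of 50, the per-page limits shrink
# until they reach zero or below, which is what stops the pagination loop.
limits = (0..3).map { |page_num| compute_page_size(120, page_num, 50) }
p limits

# Without an overall limit, every page simply uses the page size.
p compute_page_size(nil, 7, 50)
```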

.paginate_dump_records(klass, options = {}, &block) ⇒ `Object`

Pagination Helpers (Support before 2.3.2)

Quick and dirty paginate to loop through the records page by page. Options are:

:find - a map of the finder options passed to find. For example, {:conditions => ['hairy = ?', 'of course'], :include => :rodents}
:page_size - the page size to use. Defaults to dumper_page_size or 50. Set to false to disable pagination
:records - the records to be dumped instead of using a find

# File 'lib/ar_dumper_base.rb', line 450

def self.paginate_dump_records(klass, options={}, &block)#:nodoc:

  finder_options = (options[:find]||{}).clone
  
  if options[:records]
    yield options[:records], 0
    return
  #pagination is not needed when :page_size => false

  elsif options[:page_size].is_a?(FalseClass)
    yield klass.find(:all, finder_options), 0
    return
  end
  
  options[:page_size]||= dumper_page_size
  
  #limit becomes the maximum amount of records to pull

  max_records = finder_options[:limit]
  page_num = 0
  finder_options[:limit] = compute_page_size(max_records, page_num, options[:page_size])
  records = []
  while (finder_options[:limit] > 0 && (page_num == 0 || records.length == options[:page_size]))      
    records = klass.find :all, finder_options.update(:offset => page_num * options[:page_size])
    
    yield records, page_num
    page_num = page_num + 1
    
    #calculate the limit if an original limit (max_records) was set

    finder_options[:limit] = compute_page_size(max_records, page_num, options[:page_size])
  end
end
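
The loop above can be exercised against an in-memory array. This is a sketch under stated assumptions: fetch_page is a hypothetical replacement for klass.find(:all, :limit => ..., :offset => ...), and each_page and page_limit are illustrative names that follow the same control flow as the listing.

```ruby
# Hypothetical stand-in for a find with :limit and :offset.
def fetch_page(rows, limit, offset)
  rows[offset, limit] || []
end

# Same arithmetic as compute_page_size.
def page_limit(max_records, page_num, page_size)
  max_records ? [max_records - (page_num * page_size), page_size].min : page_size
end

def each_page(rows, page_size, max_records = nil)
  page_num = 0
  limit = page_limit(max_records, page_num, page_size)
  records = []
  # Stop when the limit is used up or the last page came back short.
  while limit > 0 && (page_num == 0 || records.length == page_size)
    records = fetch_page(rows, limit, page_num * page_size)
    yield records, page_num
    page_num += 1
    limit = page_limit(max_records, page_num, page_size)
  end
end

all_pages = []
each_page((1..7).to_a, 3) { |records, page_num| all_pages << [page_num, records] }
p all_pages

capped = []
each_page((1..10).to_a, 3, 5) { |records, page_num| capped << [page_num, records] }
p capped
```

In this sketch, a row count that is an exact multiple of the page size produces one final empty page, because a full page gives no sign that the data has ended.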

.paginate_each_record(klass, options = {}, &block) ⇒ `Object`

Quick and dirty paginate to loop through each record, page by page. Options are:

:find - a map of the finder options passed to find. For example, {:conditions => ['hairy = ?', 'of course'], :include => :rodents}
:page_size - the page size to use. Defaults to dumper_page_size or 50. Set to false to disable pagination

# File 'lib/ar_dumper_base.rb', line 488

def self.paginate_each_record(klass, options={}, &block)#:nodoc:

  counter = -1
  paginate_dump_records(klass, options) do |records, page_num|
    records.each do |record| 
      yield record, (counter +=1)
    end
  end
end

Instance Method Details

#build_attribute_list ⇒ `Object`

build a list of attributes, methods and procs

# File 'lib/ar_dumper_base.rb', line 115

def build_attribute_list#:nodoc:

  if options[:only]
    options[:attributes] = Array(options[:only])
  else
    options[:attributes] = @klass.column_names - Array(options[:except]).collect { |e| e.to_s }
  end
    
  options[:attributes] = options[:attributes].collect{|attr| "#{attr}"}  
  options[:methods] = options[:methods].is_a?(Hash) ? options[:methods].values : Array(options[:methods])
  
  #if procs are specified as an array separate the headers(keys) from the procs(values)

  if options[:procs].is_a?(Hash)
    options[:proc_headers]= options[:procs].keys
    options[:procs]= options[:procs].values
  else
    options[:procs] = Array(options[:procs])
    options[:proc_headers]||= Array.new
    0.upto(options[:procs].size - options[:proc_headers].size - 1) {|idx| options[:proc_headers] << "proc_#{idx}" }
  end
  
end
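
The handling of :procs is the least obvious part of this method, so here is a reduced sketch of it. split_procs is a hypothetical name, and the sketch leaves out the case where :proc_headers is supplied separately.

```ruby
# Hypothetical helper: separates header names from procs.
def split_procs(procs)
  if procs.is_a?(Hash)
    # keys become the header names, values are the procs
    [procs.keys, procs.values]
  else
    list = Array(procs)
    # unnamed procs get generated names, counting from zero
    [list.each_index.map { |idx| "proc_#{idx}" }, list]
  end
end

shout = Proc.new { |options| options[:record].to_s.upcase }
count = Proc.new { |options| options[:record].to_s.length }

named_headers, named_procs = split_procs({ :rating => shout })
auto_headers, auto_procs   = split_procs([shout, count])

p named_headers
p auto_headers
```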

#build_header_list ⇒ `Object`

Returns an array with the header names, in the same order as the data returned by dump_record: attributes + methods + procs.

:header The header defaults to the attribute and method names. When set to false, no header is written
 * hash: a map from attribute or method name to header column name
 * array: a list in the same order that is used to display record data

:procs If a hash, then the keys are the names. If an array, then the names are generated as proc_0, proc_1, etc
:text_format Format names with a text format such as :titleize, :dasherize, :underscore

# File 'lib/ar_dumper_base.rb', line 354

def build_header_list#:nodoc:


  header_options = options[:header]
  columns = @options[:attributes] + @options[:methods]
  header_names = 
    if header_options.is_a?(Hash)
      header_options.symbolize_keys!
      
      #Get the header for each attribute and method

      columns.collect{|field|(header_options[field.to_sym]||field).to_s}
      
    #ordered by attributes, methods, then procs

    elsif header_options.is_a?(Array)
      header_names = header_options
      header_names.concat(columns[header_options.length..-1]) if header_names.length < columns.length
      
    #default to column names 

    else
      columns
    end
  
  #add process names

  header_names.concat(options[:proc_headers])
  
  #format names with a text format such as titleize, dasherize, underscore

  header_names.collect!{|n| n.to_s.send(options[:text_format])} if options[:text_format]
  
  header_names
end

#csv_writer ⇒ `Object`

Try to use the FasterCSV if it exists otherwise use csv

# File 'lib/ar_dumper_base.rb', line 330

def csv_writer #:nodoc:

  unless @@csv_writer
    @@csv_writer = :faster
    begin 
      require 'faster_csv'#:nodoc:

      ::FasterCSV
    rescue Exception => exc
      @@csv_writer = :normal
    end
  end
  @@csv_writer
end

#dump(format) ⇒ `Object`

Dump to the appropriate format

# File 'lib/ar_dumper_base.rb', line 138

def dump(format)#:nodoc:


  case format.to_sym
    when :csv
      dump_to_csv
      
    when :xml
      dump_to_xml
      
    when :yaml, :fixture, :yml
      dump_to_fixture
      
    else
      raise ArDumperException.new("Unknown format #{format}. Please specify :csv, :xml, or :yml ")
  end
end

#dump_record(record) ⇒ `Object`

collect the record data into an array

# File 'lib/ar_dumper_base.rb', line 189

def dump_record(record)#:nodoc:

  record_values = @options[:attributes].inject([]){|values, attr| values << record["#{attr}"]; values }
  record_values = @options[:methods].inject(record_values) {|values, method| values << record.send(method); values }
  record_values = @options[:procs].inject(record_values){|values, proc| values << proc.call(options); values }
  record_values
end
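
The ordering of the values (attributes, then methods, then procs) can be checked with a Struct in place of a model. PageRecord and dump_row are hypothetical names; unlike the listing above, the sketch sets options[:record] itself, a step that dumper performs in the real flow.

```ruby
# Hypothetical record type; Struct supports record["name"] like ActiveRecord.
PageRecord = Struct.new(:title, :pages) do
  def long?
    pages > 300
  end
end

def dump_row(record, options)
  options[:record] = record # normally done by dumper before the block runs
  values  = options[:attributes].map { |attr| record[attr.to_s] }
  values += options[:methods].map { |name| record.send(name) }
  values + options[:procs].map { |callable| callable.call(options) }
end

options = {
  :attributes => %w[title pages],
  :methods    => [:long?],
  :procs      => [Proc.new { |opts| opts[:record].title.upcase }]
}

row = dump_row(PageRecord.new('Dune', 412), options)
p row
```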

#dump_to_csv ⇒ `Object`

CSV DUMPER

Dump csv data

:csv - any options to pass to the csv writer, such as :col_sep (column separator, for example :csv => {:col_sep => "\t"}) and :row_sep (row separator)
:page_size - the page size to use. Defaults to dumper_page_size or 50

# File 'lib/ar_dumper_base.rb', line 304

def dump_to_csv#:nodoc:

  header = nil
  @options[:csv]||={}
  
  if !@options[:header].is_a?(FalseClass)
    header_list = build_header_list
    #print the header unless set to false

    header = write_csv_row(header_list)
  end
  
  dumper(:csv, header) do |record|
    options[:target] << write_csv_row(dump_record(record))
  end
end

#dump_to_fixture ⇒ `Object`

Yaml/Fixture Dumper

Dumps the data to a fixture file. In addition to the options listed in dumper:

:root Basename of the record. Defaults to the singularized table name, so each record is named like customer_1

# File 'lib/ar_dumper_base.rb', line 279

def dump_to_fixture#:nodoc:

  basename = @options[:root]||@klass.table_name.singularize
  header_list = build_header_list

  # doctor the yaml a bit to print the hash header at the top

  # instead of each record

  dumper(:yml, "---\s") do |record|
    record_data = Hash.new
    dump_record(record).each_with_index{|field, idx| record_data[header_list[idx].to_s] = field.to_s }
    options[:target] << {"#{basename}_#{record.id}" => record_data}.to_yaml.gsub(/^---\s\n/, "\n")
  end
end
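
The idea of writing the YAML document marker once and stripping it from each record can be tried with the standard yaml library. This is a sketch, not the plugin's code: fixture_entry is a hypothetical helper, and the pattern is adapted to the "---" line that current Psych versions emit, which may differ from the YAML engine the listing was written for.

```ruby
require 'yaml'

# Hypothetical helper: one fixture entry without its own "---" line.
def fixture_entry(basename, id, data)
  { "#{basename}_#{id}" => data }.to_yaml.sub(/\A---\s*\n/, '')
end

# Write the document marker once, then append every record.
target = String.new("---\n")
target << fixture_entry('book', 1, { 'title' => 'Dune' })
target << fixture_entry('book', 2, { 'title' => 'Emma' })

puts target
```

The result parses as a single YAML document with one key per record.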

#dump_to_xml ⇒ `Object`

XML Dumper

Dumps the data to an xml file

Using the ActiveRecord version of dumper so we CANNOT specify fields that are not attributes

In addition to the options listed in dumper:

:xml - xml options for the xml_serializer. Includes :indent, :skip_instruct, :margin

Note that :procs will use the dumper proc and pass the dumper options

Book.dump :xml, :procs => {:topic_content => Proc.new { |options| options[:record].topic.content }}

To use xml proc, specify :xml => {:procs => array_of_procs}

Book.dump :xml, :xml => {:procs => [Proc.new{|xml_options| xml_options[:builder].tag 'abc', 'def'}]}

# File 'lib/ar_dumper_base.rb', line 212

def dump_to_xml#:nodoc:


  #preserve the original skip instruct

  skip_instruct = @options[:xml] && @options[:xml][:skip_instruct].is_a?(TrueClass)

  self.options[:procs]||= []

  #use the fields if :only is not specified in the xml options

  xml_options = {
    :only => @options[:only],
    :except => @options[:except],
    :methods => @options[:methods]
  }
  
  xml_options.update(@options[:xml]) if @options[:xml]

  #do not instruct for each set

  xml_options[:skip_instruct] = true
  xml_options[:indent]||=2
  xml_options[:margin] = xml_options[:margin].to_i + 1
  
  #set the variable on the options

  options[:xml] = xml_options
  
  #builder for header and footer

  builder_options = {
    :margin => xml_options[:margin] - 1,
    :indent => xml_options[:indent]
  }
  
  options[:root] = (options[:root] || @klass.to_s.underscore.pluralize).to_s

  #use the builder to make sure we are indented properly

  builder = Builder::XmlMarkup.new(builder_options.clone)
  builder.instruct! unless skip_instruct
  builder << "<#{options[:root]}>\n"
  header = builder.target!
  
  #get the footer. Using the builder will make sure we are indented properly

  builder = Builder::XmlMarkup.new(builder_options)
  builder << "</#{options[:root]}>"
  footer = builder.target!

  dumper(:xml, header, footer) do |record|
    options[:target] << serialize_record_dump_xml(record, xml_options)
  end
end

#dumper(file_extension = nil, header = nil, footer = nil, &block) ⇒ `Object`

Wrapper around the dump. The main dump functionality

# File 'lib/ar_dumper_base.rb', line 156

def dumper(file_extension=nil, header = nil, footer = nil, &block)#:nodoc:

  
  options[:counter] = -1
  begin
    #get the file parameters

    target = prepare_target(file_extension)
    target << header if header
    ArDumper.paginate_dump_records(@klass, @options) do |records, page_num|
    
      #save state on options to make it accessible by

      #class and procs

      options[:result_set] = records
      options[:page_num] = page_num
      
      records.each do |record|
        options[:record] = record
        yield record
      end
      
      #flush after each set

      target.flush if target.respond_to?(:flush)
    end
    target << footer if footer
    
  #final step close the options[:target]

  ensure
    target.close if target && target.respond_to?(:close)
  end

  options[:full_file_name]||target
end
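
Stripped of file handling and ActiveRecord, the wrapper follows a header, pages, footer pattern. The sketch below shows only that pattern; dump_pages is a hypothetical name, the pages are plain arrays, and the block returns the text to write instead of appending to options[:target] as the real block does.

```ruby
# Hypothetical reduction of the dumper flow to its write pattern.
def dump_pages(pages, target, header = nil, footer = nil)
  target << header if header
  pages.each do |records|
    records.each { |record| target << yield(record) }
    # flush after each page when the target supports it (files do, strings do not)
    target.flush if target.respond_to?(:flush)
  end
  target << footer if footer
  target
ensure
  # always release file handles, even if a block raised
  target.close if target.respond_to?(:close)
end

output = dump_pages([[1, 2], [3]], String.new, "<items>\n", "</items>") do |record|
  "  <item>#{record}</item>\n"
end

puts output
```

Any object that responds to << can serve as the target, which is why a String, a Tempfile, and a File are interchangeable here.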

#prepare_target(file_extension = nil) ⇒ `Object`

Create the target (file) based on these options. The target must respond to <<. Current target types are :string, :file, and :tmp_file

:target_type The target type for the data. Defaults to :file

* :string prints to a string. Do not use with large data sets
* :tmp_file uses a temporary file that is destroyed when the process exits
* :file uses a standard file

:filename basename of the file. Defaults to a random time-based string for non-temporary files

:file_extension extension (suffix) such as .csv or .xml. Added only if the basename has no suffix. Only available when :target_type => :file

:file_path path or directory of the file. Defaults to dumper_file_path or the temporary directories

# File 'lib/ar_dumper_base.rb', line 400

def prepare_target(file_extension = nil)#:nodoc:

  
  options[:target] = case options[:target_type]
    #to string option dumps to a string instead of a file

    when :string
      String.new
      
    #use a temporary file

    #open a temporary file with the basename specified by filename

    #defaults to the value of one of the environment variables TMPDIR, TMP, or TEMP

    when :tmp_file

      Tempfile.open(options[:filename]||(@@dumper_tmp_file_basename+@klass.name.downcase), 
                    options[:file_path]||@@dumper_file_path)
                    
    #default to a real file                

    else
      extension = options[:file_extension]||file_extension
      mode = options[:append_to_file].is_a?(TrueClass)? 'a' : 'w'
      filename = options[:filename]||"#{@@dumper_tmp_file_basename}.#{@klass.name.downcase}.#{Time.now.to_f.to_s}.#{extension}"
      
      #append an extension unless one already exists

      # use !~ so the negation applies to the match result, not to filename
      filename += ".#{extension}" if extension && filename !~ /\.\w*$/
      
      #get the file path if the filename does not contain one

      if File.basename(filename) == filename
        path = options[:file_path]||@@dumper_file_path 
        filename = File.join(path, filename) unless path.blank?
      end

      
      File.open(filename, mode)
   end
   
  options[:full_file_name] = options[:target].path if options[:target].respond_to?(:path)
  options[:target]
end
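
The documented rule that the extension is added only if the basename has no suffix can be isolated in a few lines. with_extension is a hypothetical helper written for this example; it uses !~ so that the negation applies to the result of the pattern match.

```ruby
# Hypothetical helper: append the extension unless a suffix is present.
def with_extension(filename, extension)
  return filename if extension.nil?
  filename !~ /\.\w*$/ ? "#{filename}.#{extension}" : filename
end

p with_extension('books', :csv)
p with_extension('books.txt', :csv)
p with_extension('books', nil)
```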

#serialize_record_dump_xml(record, xml_options) ⇒ `Object`

Serialize the xml data for the given record

# File 'lib/ar_dumper_base.rb', line 261

def serialize_record_dump_xml(record, xml_options)#:nodoc:


  serializer = ActiveRecord::XmlSerializer.new(record, xml_options.dup)
  xml = serializer.to_s do |builder|
    self.options[:procs].each_with_index do |proc, idx|
      serializer.add_tag_for_value(self.options[:proc_headers][idx].to_s,
                                   proc.call(self.options))
    end
  end
end

#write_csv_row(row_data, header_list = []) ⇒ `Object`

Write out the csv row using the selected csv writer

# File 'lib/ar_dumper_base.rb', line 320

def write_csv_row(row_data, header_list=[])#:nodoc: 

  if csv_writer == :faster
    ::FasterCSV::Row.new(header_list, row_data).to_csv(@options[:csv])
  else
    ::CSV.generate_line(row_data, @options[:csv][:col_sep], @options[:csv][:row_sep]) + (@options[:csv][:row_sep]||"\n")
  end
end
