Rseed
Rseed is a library and set of utilities for importing data into your Rails project, whether in bulk or just a few lines at a time.
Rseed is a replacement for active_import (github.com/intrica/active_import). There are lots of improvements in order to make it easy to create and maintain converters.
Rseed can also import from Excel files with the optional rseed-roo gem (github.com/intrica/rseed-roo).
Requirements
Rails >= 3.0, Ruby >= 1.9
Installation
Simply add the following to your Gemfile:
gem 'rseed'
Then run:
bundle install
Quick Example
rails g rseed:converter HtmlColor
This will create a model converter in the directory app/rseed. Read through the generated file to see how the import works.
This also creates a default data file of html_colors.csv in db/rseed. This will be the CSV used for this converter.
Generators
The generator will automatically create attribute lines for all of the attributes in the model except :id, :created_at, :updated_at.
Generator options are as follows:
- --attribute
  This sets up the converter to do a first_or_initialize on the specified attribute instead of a new.
- --minimal
  This causes the generator to create a file with fewer comments and without redundant definitions in the columns.
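To illustrate the difference the attribute option makes, here is a minimal plain-Ruby sketch of "always new" versus "first or initialize" behaviour. It uses an in-memory array rather than Active Record, and the Color struct and both helper methods are hypothetical names for illustration, not what the generator emits.

```ruby
# Hypothetical in-memory stand-in for a model (not Active Record).
Color = Struct.new(:name, :hex)

# Always build a new record, as a plain `new` would.
def import_with_new(store, row)
  record = Color.new(row[:name], row[:hex])
  store << record
  record
end

# Reuse the record whose name matches, or build one if none exists,
# mimicking first_or_initialize keyed on :name.
def import_with_first_or_initialize(store, row)
  record = store.find { |color| color.name == row[:name] }
  if record.nil?
    record = Color.new(row[:name])
    store << record
  end
  record.hex = row[:hex]
  record
end

rows = [
  { name: "red", hex: "#FF0000" },
  { name: "red", hex: "#EE0000" } # same key, updated value
]

duplicated = []
rows.each { |row| import_with_new(duplicated, row) }

merged = []
rows.each { |row| import_with_first_or_initialize(merged, row) }
```

Running the same rows through both helpers leaves two entries in duplicated but a single, updated entry in merged, which is why keying on an attribute is useful when an import is run repeatedly.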
The Converter File
Note: For some reason that we haven't been able to pin down, Rails struggles to resolve constants within these converters. This does not affect converters running in production mode. The workaround is to prefix any constant names (such as model classes) with :: within your converters. This is done by default when using the generator.
Attribute Options
- :header
  Defines the name of the attribute to be used for serialization. If no :match is defined, it is also used to match the attribute name of the input to the attribute being defined.
- :match
  A regex string used to match the attribute name of the input to the attribute being defined. If this is not defined, a match is checked against :header and then the attribute name.
- :type
  Defines the type to which the input string is converted.
- :model
  Set this to the name of a model that this attribute should resolve to. The name is classified, so a symbol works here. Alternatively, if only :model_attribute is set, the name of the attribute is used as the model name.
- :model_attribute
  Specifies which attribute on the model is used for lookup.
- :model_match
  Specifies how the model should be resolved. The value is the method called on the where relation used to look up the model, and it defaults to :first. For example, if your model is Person and the :model_attribute is :name, this is what is called to set the attribute value:
      Person.where(name: <value>).first
  You may use any Active Record method here, such as :first_or_create or :last.
- :optional
  Defines the attribute as optional. This has no effect in the HashAdapter.
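The matching order described for :header and :match can be sketched in plain Ruby. This illustrates the documented precedence only and is not Rseed's actual implementation: the match_column helper and the shape of the ATTRIBUTES hash are assumptions made for the example, as is the case-insensitive comparison.

```ruby
# Hypothetical attribute definitions, keyed by attribute name.
ATTRIBUTES = {
  name: { header: "Colour Name", match: "^colou?r(\\s+name)?$" },
  hex:  { header: "Hex Code" },
  red:  {}
}

# Return the attribute name that an input column heading maps to, or nil.
# Precedence: the :match regex if defined, otherwise :header, then the attribute name.
def match_column(attributes, heading)
  heading = heading.to_s.strip
  found = attributes.find do |attr_name, opts|
    if opts[:match]
      Regexp.new(opts[:match], Regexp::IGNORECASE).match(heading)
    elsif opts[:header] && opts[:header].casecmp(heading).zero?
      true
    else
      attr_name.to_s.casecmp(heading).zero?
    end
  end
  found && found.first
end

match_column(ATTRIBUTES, "Color")    # resolved through the :match regex
match_column(ATTRIBUTES, "hex code") # resolved through :header
match_column(ATTRIBUTES, "RED")      # resolved through the attribute name
```

A regex in :match is handy when the same converter has to accept files whose column headings vary slightly between sources.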
before_deserialize
If you define a method called before_deserialize, you can do any preprocessing you require. One example of this is setting an archive flag on existing data:
def before_deserialize
  # Flag every existing record; records seen during the import clear the flag again.
  HtmlColor.update_all(import_archive: true)
  true
end
Note that you must return true from this method. Returning false causes the processor to give up and log an error, so you can also use the following:
def before_deserialize
  # Abort the import when a required option was not supplied.
  return fail_with_error "Mandatory option is missing" unless options["mandatory_option"]
  true
end
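The return-value contract of this hook can be sketched without Rails. The SketchConverter class and the process method below are hypothetical stand-ins, and the sketch assumes that fail_with_error records a message and returns false; Rseed's real processor may behave differently.

```ruby
# Hypothetical stand-in for a converter; only the hook contract is modelled.
class SketchConverter
  attr_reader :options, :error

  def initialize(options = {})
    @options = options
  end

  # Assumed behaviour: record the message and return false so the caller gives up.
  def fail_with_error(message)
    @error = message
    false
  end

  def before_deserialize
    return fail_with_error("Mandatory option is missing") unless options["mandatory_option"]
    true
  end
end

# Hypothetical processing loop: rows are only handled when the hook returns true.
def process(converter, rows)
  return [] unless converter.before_deserialize
  rows.map { |row| row.upcase }
end

process(SketchConverter.new, ["red"])                                 # hook fails, nothing processed
process(SketchConverter.new("mandatory_option" => "yes"), ["red"])    # hook passes
```

Forgetting the trailing true would make the hook return the value of its last statement, which can silently stop an import if that value happens to be false or nil.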
after_deserialize
You can define this method to be called at the end of processing. Following from the example above, if you set import_archive to false for each model in the deserialize method, you could do the following to remove old records:
def after_deserialize
  HtmlColor.where(import_archive: true).destroy_all
end
This example is obviously fairly destructive and there are better ways to deal with this situation than destroying the records.
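The archive-flag flow across the hooks can be sketched end to end with in-memory records. The Record struct and the sweep method are hypothetical; the sketch returns the stale records instead of destroying them, which is one possible less destructive alternative rather than a recommendation from Rseed.

```ruby
# Hypothetical in-memory record; import_archive mirrors the flag used in the hooks.
Record = Struct.new(:name, :import_archive)

# Flag every existing record, clear the flag on each record seen in the import,
# and return the records that are still flagged (absent from the import source).
def sweep(existing, imported_names)
  # before_deserialize step: flag everything that already exists
  existing.each { |record| record.import_archive = true }

  # deserialize step: find or build each imported record and clear its flag
  imported_names.each do |name|
    record = existing.find { |candidate| candidate.name == name }
    if record.nil?
      record = Record.new(name)
      existing << record
    end
    record.import_archive = false
  end

  # after_deserialize step: report stale records rather than destroying them
  existing.select(&:import_archive)
end

existing = [Record.new("red"), Record.new("teal")]
stale = sweep(existing, ["red", "navy"])
```

Here stale contains only the "teal" record, which could then be hidden, reported, or reviewed before anything is deleted.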
Rake Tasks
These rake tasks allow you to run seeds manually:
rake rseed:csv    # Load csv file into a model using a model converter
Examples
rake rseed:csv file=users.csv converter=User ="give_admin_access=true" ="col_sep=\t"
In this case the file db/rseed/users.csv would be run through the converter UserConverter. The options specified are available within the converter. In this case the give_admin_access option will evaluate to the string "true".
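Since the value is quoted as "true" above, it appears that options passed on the command line reach the converter as strings. The sketch below shows one way a key=value string could be turned into a hash, and why the value needs an explicit comparison. The parse_option_string helper and the comma separator for multiple pairs are assumptions, not Rseed's documented parsing.

```ruby
# Hypothetical helper: turn "a=1,b=2" into { "a" => "1", "b" => "2" }.
def parse_option_string(string)
  string.to_s.split(",").each_with_object({}) do |pair, options|
    # Split on the first "=" only, so values may themselves contain "=".
    key, value = pair.split("=", 2)
    next if key.nil? || key.strip.empty?
    options[key.strip] = value.to_s
  end
end

options = parse_option_string("give_admin_access=true")

# The value is a string, not a boolean, so compare it explicitly.
give_admin = options["give_admin_access"] == "true"
```

Testing the string explicitly avoids the trap where a value such as "false" is still truthy in Ruby.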
Processor Options
- :within_transaction
  Setting this to true wraps the entire deserialize in a transaction, avoiding a commit after each line. For large data sets, this should speed up insertion.
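The effect of wrapping the import in one transaction can be sketched with a fake connection that only counts commits. CountingConnection and import are hypothetical names; the sketch mirrors the general Active Record behaviour where each save commits on its own unless an outer transaction is already open, and says nothing about Rseed's internals.

```ruby
# Hypothetical connection that only counts commits; not a real database adapter.
class CountingConnection
  attr_reader :commits, :rows

  def initialize
    @commits = 0
    @rows = []
    @in_transaction = false
  end

  # Run the block as one unit of work and commit once at the end.
  def transaction
    return yield if @in_transaction # nested calls join the outer transaction
    @in_transaction = true
    begin
      result = yield
      @commits += 1
      result
    ensure
      @in_transaction = false
    end
  end

  # Each insert commits on its own unless an outer transaction is open.
  def insert(row)
    transaction { @rows << row }
  end
end

# Insert all rows and return how many commits were needed.
def import(connection, rows, within_transaction)
  if within_transaction
    connection.transaction { rows.each { |row| connection.insert(row) } }
  else
    rows.each { |row| connection.insert(row) }
  end
  connection.commits
end

per_row = import(CountingConnection.new, %w[red green blue], false)
wrapped = import(CountingConnection.new, %w[red green blue], true)
```

With three rows, per_row counts one commit per insert while wrapped counts a single commit, which is where the speed-up for large files comes from.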
Seeding
If you want to seed your Rails application using Rseed, the best method is to add lines like the following to db/seeds.rb:
Rseed::from_csv 'html_colors.csv', converter: :html_color, converter_options: { no_red: true }
You can always use if statements in this file to filter seeds by Rails.env or something similar.
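As an alternative to scattering if statements, the environment filter can be kept in one table. This is a sketch only: SEED_FILES, seeds_for, and the demo_users.csv file are hypothetical, and the Rseed call is shown as a comment because it needs a Rails application to run.

```ruby
# Hypothetical table of seed files: [file, converter, environments it applies to].
SEED_FILES = [
  ["html_colors.csv", :html_color, %w[development test production]],
  ["demo_users.csv",  :user,       %w[development]]
]

# Return the [file, converter] pairs that should be loaded for the given environment.
def seeds_for(env, seed_files = SEED_FILES)
  seed_files
    .select { |_file, _converter, envs| envs.include?(env.to_s) }
    .map { |file, converter, _envs| [file, converter] }
end

# In db/seeds.rb, inside a Rails app with Rseed loaded, this could drive the import:
#   seeds_for(Rails.env).each do |file, converter|
#     Rseed::from_csv file, converter: converter
#   end
```

Keeping the list in one place makes it easy to see at a glance which files are loaded in which environment.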
Custom Type Conversions
TODO