Class: Bucket
- Inherits: Object
- Defined in: lib/etl/bucket.rb
Overview
Sometimes I have data coming from several sources. I want to combine the sources and release a consolidated record. This is meant to work like that. For a weird example:
>> my_hash = {:surprise => 'me'}
=> {:surprise=>"me"}
>> b = Bucket.new(my_hash) {|h| h.inject({}) {|hsh, e| hsh[e.first] = e.last % 3; hsh}}
=> #<Bucket:0x232d230 @raw_data={:surprise=>"me"}, @filter_block=#<Proc:0x0232d26c@(irb):2>>
>> b.add :this => 1
=> {:this=>1}
>> b.add OpenStruct.new(:this => 6)
=> {:this=>6}
>> b.raw_data
=> {:this=>6}
>> b.filtered_data
=> {:this=>0}
>> b.dump
=> {:this=>0}
>> b.raw_data
=> {}
A more practical use that I have for this is with screen scraping: when I’m getting the source of some concept, I may ask the same site for information at different times, or ask complementary sites for overlaying data. A much more practical use of this is with the TimeBucket, a bucket that creates a time series from observations that may be on very different time schedules.
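The combine-then-release idea described above can be sketched with a small stand-in class. MiniBucket is a hypothetical name used only for illustration; it is not the Bucket class from lib/etl/bucket.rb, and the merge behavior (later values overwrite earlier ones for the same key) is an assumption rather than something the source confirms.

```ruby
require 'ostruct'

# MiniBucket is a hypothetical stand-in for illustration only; it is not
# the Bucket class from lib/etl/bucket.rb.
class MiniBucket
  attr_reader :raw_data

  def initialize(obj = nil, &block)
    @filter_block = block
    @raw_data = {}
    add(obj) if obj
  end

  # Merge a Hash or OpenStruct into the accumulated record.
  # Assumption: later values overwrite earlier ones for the same key.
  def add(obj)
    @raw_data.merge!(obj.to_h)
  end

  # Apply the filter block, if any, without clearing the bucket.
  def filtered_data
    @filter_block ? @filter_block.call(@raw_data) : @raw_data
  end

  # Release the consolidated record and start over with an empty bucket.
  def dump
    data = @raw_data
    @raw_data = {}
    @filter_block ? @filter_block.call(data) : data
  end
end

# The filter reduces every value modulo 3, as in the overview example.
bucket = MiniBucket.new { |h| h.inject({}) { |hsh, (k, v)| hsh[k] = v % 3; hsh } }
bucket.add(:this => 1)
bucket.add(OpenStruct.new(:this => 6, :that => 4))
released = bucket.dump
```

In this sketch, filtered_data leaves the accumulated record in place, while dump returns the filtered record and empties the bucket so the next record can be collected.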
Instance Attribute Summary
- #filter_block ⇒ Object
  The block used to filter the bucket.
- #raw_data ⇒ Object (also: #to_hash) readonly
  The data in the bucket, as an OpenStruct.
- #white_list ⇒ Object (also: #labels)
  Reveals the white list.
Instance Method Summary
- #add(obj) ⇒ Object
- #dump ⇒ Object
- #filtered_data ⇒ Object
- #initialize(obj = nil, &block) ⇒ Bucket constructor
  A new instance of Bucket.
- #ordered_data ⇒ Object
  Uses the facets/dictionary to deliver an ordered hash, in the order of the white list.
- #to_a ⇒ Object (also: #to_array)
- #to_obj(klass, use_hash = false) ⇒ Object (also: #to_struct)
  Initializes a class with the values of the raw data.
- #to_open_struct ⇒ Object
Constructor Details
#initialize(obj = nil, &block) ⇒ Bucket
Returns a new instance of Bucket.
# File 'lib/etl/bucket.rb', line 42
def initialize(obj=nil, &block)
  @filter_block = block
  reset_bucket
  assert_object(obj) if obj
end
Instance Attribute Details
#filter_block ⇒ Object
The block used to filter the bucket. Useful for converting the data to a different data type. Examples:
Return a hash:
  b.filter_block = lambda{|o| o.table}
Return an array:
  b.filter_block = lambda{|o| o.table.values}
# File 'lib/etl/bucket.rb', line 37
def filter_block
  @filter_block
end
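To illustrate the kind of type conversion a filter block performs, here is a minimal standalone sketch that applies two lambdas directly to an OpenStruct. The record contents and the lambda names are made up for the example, and the sketch uses OpenStruct's public to_h rather than table, on the assumption that the intent is to expose the underlying attributes.

```ruby
require 'ostruct'

# Illustrative record; the field names and values are invented for this sketch.
record = OpenStruct.new(:name => 'widget', :price => 3)

# Hypothetical filter blocks: each takes the raw record and returns another type.
to_hash_filter  = lambda { |o| o.to_h }
to_array_filter = lambda { |o| o.to_h.values }

as_hash  = to_hash_filter.call(record)   # a Hash of attribute names to values
as_array = to_array_filter.call(record)  # an Array of the values only
```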
#raw_data ⇒ Object (readonly) Also known as: to_hash
The data in the bucket, as an OpenStruct
# File 'lib/etl/bucket.rb', line 40
def raw_data
  @raw_data
end
#white_list ⇒ Object Also known as: labels
Reveals the white list. If this is set, it is an array, and it not only filters the data in the bucket, but also orders it.
# File 'lib/etl/bucket.rb', line 93
def white_list
  @white_list
end
Instance Method Details
#add(obj) ⇒ Object
# File 'lib/etl/bucket.rb', line 48
def add(obj)
  assert_object(obj)
end
#dump ⇒ Object
# File 'lib/etl/bucket.rb', line 52
def dump
  data = self.raw_data
  reset_bucket
  filter(data)
end
#filtered_data ⇒ Object
# File 'lib/etl/bucket.rb', line 58
def filtered_data
  filter(self.raw_data)
end
#ordered_data ⇒ Object
Uses the facets/dictionary to deliver an ordered hash, in the order of the white list.
# File 'lib/etl/bucket.rb', line 64
def ordered_data
  return self.raw_data unless self.white_list
  dictionary = Dictionary.new
  self.white_list.each do |k|
    dictionary[k] = self.raw_data[k]
  end
  dictionary
end
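The white-list ordering above can be approximated without facets. This sketch swaps in a plain Ruby Hash, which keeps insertion order in Ruby 1.9 and later, in place of Dictionary; order_by_white_list is a hypothetical helper name, not part of the library.

```ruby
# Hypothetical helper: restrict a hash to the white-listed keys and return
# them in white-list order. A plain Hash stands in for facets' Dictionary.
def order_by_white_list(raw_data, white_list = nil)
  # With no white list, the data is returned unchanged.
  return raw_data unless white_list

  white_list.each_with_object({}) do |key, ordered|
    # Keys missing from raw_data map to nil, as in the loop shown above.
    ordered[key] = raw_data[key]
  end
end

raw = { :c => 3, :a => 1, :b => 2 }
ordered = order_by_white_list(raw, [:a, :b])
values_in_order = ordered.values
```

Because the result is built by walking the white list, keys that are not listed (here :c) are dropped, which is how the white list both filters and orders the data.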
#to_a ⇒ Object Also known as: to_array
# File 'lib/etl/bucket.rb', line 73
def to_a
  self.ordered_data.values
end
#to_obj(klass, use_hash = false) ⇒ Object Also known as: to_struct
Initializes a class with the values of the raw data. Good for structs and struct-like classes.
# File 'lib/etl/bucket.rb', line 82
def to_obj(klass, use_hash=false)
  use_hash ? klass.new(self.raw_data) : klass.new(*self.raw_data.values)
end
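The two construction modes can be seen in a standalone sketch that mirrors the method body above as a free function taking the raw data as an argument. It assumes the raw data behaves like a Hash; the point_class struct and the sample values are invented for the example.

```ruby
require 'ostruct'

# Standalone version of the method above, with the raw data passed in.
# Assumption: raw_data is a Hash (it must respond to #values).
def to_obj(raw_data, klass, use_hash = false)
  use_hash ? klass.new(raw_data) : klass.new(*raw_data.values)
end

# Hypothetical struct whose members match the order of the data's values.
point_class = Struct.new(:x, :y)
raw = { :x => 1, :y => 2 }

# Positional mode: values are splatted into the constructor in hash order.
point = to_obj(raw, point_class)

# Hash mode: the whole hash is passed as a single argument.
ostruct = to_obj(raw, OpenStruct, true)
```

Positional mode depends on the order of the values matching the order of the class's constructor arguments, so it suits structs; hash mode suits classes such as OpenStruct whose constructor accepts a single hash.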
#to_open_struct ⇒ Object
# File 'lib/etl/bucket.rb', line 87
def to_open_struct
  OpenStruct.new(self.raw_data)
end