Class: Chicago::Schema::Dimension

Inherits:
Table
  • Object
Defined in:
lib/chicago/schema/dimension.rb

Overview

A dimension in the star schema.

Dimensions contain denormalized values from various source systems, and are used to group and filter the fact tables. They may also be queried themselves.

You shouldn’t need to initialize a Dimension yourself; dimensions should be created via StarSchema#define_dimension.

Instance Attribute Summary

Attributes inherited from Table

#description, #natural_key, #table_name

Attributes included from NamedElement

#label, #name

Instance Method Summary

Methods inherited from Table

#[], #qualify

Constructor Details

#initialize(name, opts = {}) ⇒ Dimension

Creates a new Dimension, named name.

Parameters:

  • name

    the name of the dimension

  • opts (Hash) (defaults to: {})

    a customizable set of options

Options Hash (opts):

  • columns (Array)
  • identifiers (Array)
  • null_records (Array)

    an array of attribute hashes, used to create null record rows in the database. Hashes must have an :id key.

  • natural_key (Array<Symbol>)

    an array of symbols, representing a uniqueness constraint on the dimension.

  • description (Object)

    a long text description about the dimension.

  • predetermined_values (Boolean)

    if truthy, marks the set of values for this dimension as predetermined (see #has_predetermined_values?).

  • uncountable (Boolean)

    if truthy, marks the dimension entries as not countable (see #countable?).

Raises:



# File 'lib/chicago/schema/dimension.rb', line 56

def initialize(name, opts={})
  super
  @columns = opts[:columns] || []
  @identifiers = opts[:identifiers] || []
  @null_records = opts[:null_records] || []
  @null_records.product(columns).each do |record, column|
    record[column.name] = column.default_value unless record.has_key?(column.name)
  end

  @table_name = sprintf(DIMENSION_TABLE_FORMAT, name).to_sym
  @key_table_name = sprintf(KEY_TABLE_FORMAT, @table_name).to_sym
  @predetermined_values = !! opts[:predetermined_values]
  @countable = !opts[:uncountable]
  check_null_records
end
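
The default-filling step in the constructor can be tried in isolation. The following is a minimal sketch, assuming a hypothetical SketchColumn struct that stands in for the library's column class and exposes only name and default_value; the column names and default values are invented for illustration.

```ruby
# Hypothetical stand-in for a column: only name and default_value matter here.
SketchColumn = Struct.new(:name, :default_value)

columns = [
  SketchColumn.new(:original_id, 0),
  SketchColumn.new(:name, "Unknown")
]

# Null records as they might be passed via opts[:null_records].
null_records = [
  { :id => 1 },
  { :id => 2, :name => "Not Applicable" }
]

# Same pairing as the constructor: every (record, column) combination is
# visited, and only keys the record does not already define are filled in.
null_records.product(columns).each do |record, column|
  record[column.name] = column.default_value unless record.has_key?(column.name)
end
```

After this runs, the first record carries both defaults, while the second keeps its explicit :name and gains only the :original_id default.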

Instance Attribute Details

#columns ⇒ Object (readonly) Also known as: column_definitions

Returns an array of Columns defined on this dimension.

See Also:



# File 'lib/chicago/schema/dimension.rb', line 25

def columns
  @columns
end

#identifiers ⇒ Object (readonly)

Returns all the human-friendly identifying columns for this dimension.

There is no expectation that identifying values will be unique, but they are intended to identify a single record in a user-friendly way.



# File 'lib/chicago/schema/dimension.rb', line 36

def identifiers
  @identifiers
end

#key_table_name ⇒ Object (readonly)

The table used to generate/store dimension keys.



# File 'lib/chicago/schema/dimension.rb', line 39

def key_table_name
  @key_table_name
end
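
The constructor derives both table names with sprintf. The sketch below illustrates that derivation only. The two format strings are assumed values chosen for the example; the real DIMENSION_TABLE_FORMAT and KEY_TABLE_FORMAT constants are defined elsewhere in the library and may differ.

```ruby
# Assumed format strings, for illustration only; the library's actual
# constants are not shown on this page and may differ.
DIMENSION_TABLE_FORMAT = "dimension_%s"
KEY_TABLE_FORMAT = "keys_%s"

name = :customer

# Same two steps as the constructor: the key table name is built from the
# already-formatted dimension table name, not from the bare dimension name.
table_name = sprintf(DIMENSION_TABLE_FORMAT, name).to_sym
key_table_name = sprintf(KEY_TABLE_FORMAT, table_name).to_sym
```

Under these assumed formats, the key table name embeds the full dimension table name, because the second sprintf receives @table_name rather than name.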

#null_records ⇒ Object (readonly)

Records representing missing or not applicable dimension values.



# File 'lib/chicago/schema/dimension.rb', line 42

def null_records
  @null_records
end

Instance Method Details

#countable? ⇒ Boolean

Returns true if these dimension entries can be counted.

Returns:

  • (Boolean)


# File 'lib/chicago/schema/dimension.rb', line 115

def countable?
  @countable && identifiable?
end
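
As the listing shows, a dimension is countable only when it was not created with the :uncountable option and it is also identifiable. A minimal sketch of that combination follows; countable_sketch is a hypothetical helper written for this example, not part of the library.

```ruby
# Hypothetical helper mirroring the two inputs to countable?: the
# :uncountable option and whether the dimension is identifiable.
def countable_sketch(opts, identifiable)
  countable_flag = !opts[:uncountable] # same derivation as in the constructor
  countable_flag && identifiable
end

cases = [
  [{}, true],                       # default options, identifiable
  [{}, false],                      # default options, not identifiable
  [{ :uncountable => true }, true]  # explicitly opted out
]

results = cases.map { |opts, identifiable| countable_sketch(opts, identifiable) }
```

Only the first case yields true: leaving out :uncountable is not enough on its own, since a dimension without an original_id column is still not countable.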

#create_null_records(db, overridden_table_name = nil) ⇒ Object

Creates null records in a Database.

This overwrites any existing records that share an id with a null record, so use it with care.

Optionally provide an overridden table name, if you need to create null records for a temporary version of the table.



# File 'lib/chicago/schema/dimension.rb', line 79

def create_null_records(db, overridden_table_name=nil)
  table_to_populate = overridden_table_name || table_name

  unless @null_records.empty?
    begin
      db[table_to_populate].insert_replace.
        multi_insert(@null_records)
    rescue Exception => e
      raise "Cannot populate null records for dimension #{name} (table #{table_to_populate})\n #{e.message}"
    end

    begin
      if db.table_exists?(key_table_name)
        ids = @null_records.map {|r| {:dimension_id => r[:id], :original_id => r[:original_id] || 0} }
        db[key_table_name].insert_replace.multi_insert(ids)
      end
    rescue Exception => e
      raise "Cannot populate key table records for dimension #{name} (table #{table_to_populate})\n #{e.message}"
    end
  end
end
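
The rows written to the key table are derived from the null records with a plain map. The following sketch reproduces only that mapping, without a database connection; the sample records are invented for illustration.

```ruby
# Sample null records, invented for this example.
null_records = [
  { :id => 1, :name => "Unknown" },
  { :id => 2, :name => "Not Applicable", :original_id => -1 }
]

# Same mapping as in create_null_records: each null record becomes one key
# table row, and a missing :original_id falls back to 0.
key_rows = null_records.map do |r|
  { :dimension_id => r[:id], :original_id => r[:original_id] || 0 }
end
```

Because the fallback uses ||, a record whose :original_id is nil is treated the same as one with no :original_id at all.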

#has_predetermined_values? ⇒ Boolean

Returns true if the set of values for this dimension is predetermined.

Examples include date dimensions and currency dimensions.

Returns:

  • (Boolean)


# File 'lib/chicago/schema/dimension.rb', line 124

def has_predetermined_values?
  @predetermined_values
end

#identifiable? ⇒ Boolean

TODO:

change to be consistent with identifiers

Returns true if this dimension can be identified as a concrete entity, with an original_id from a source system.

Returns:

  • (Boolean)


# File 'lib/chicago/schema/dimension.rb', line 110

def identifiable?
  !! original_key
end

#main_identifier ⇒ Object

Returns the main identifier for this record.



# File 'lib/chicago/schema/dimension.rb', line 102

def main_identifier
  @identifiers.first
end

#original_key ⇒ Object

TODO:

make configurable.

Returns the column that represents the id in the original source for the dimension.

Currently this column must be called original_id.



# File 'lib/chicago/schema/dimension.rb', line 134

def original_key
  @original_key ||= @columns.detect {|c| c.name == :original_id }
end
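
The lookup is a name match over the dimension's columns. A minimal sketch follows, assuming a hypothetical OriginalKeyColumn struct in place of the library's column class; the column names are invented.

```ruby
# Hypothetical stand-in for a column; only the name is needed for the lookup.
OriginalKeyColumn = Struct.new(:name)

# Same search as original_key, without the memoization.
def find_original_key(columns)
  columns.detect { |c| c.name == :original_id }
end

identified = [OriginalKeyColumn.new(:original_id), OriginalKeyColumn.new(:email)]
anonymous = [OriginalKeyColumn.new(:year), OriginalKeyColumn.new(:month)]

key = find_original_key(identified)    # the :original_id column
missing = find_original_key(anonymous) # nil, because no column matches
```

Since detect returns nil when nothing matches, the ||= in the listing caches only a successful lookup; for a dimension with no original_id column the search is repeated on each call.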

#visit(visitor) ⇒ Object

Dimensions accept Visitors.



# File 'lib/chicago/schema/dimension.rb', line 139

def visit(visitor)
  visitor.visit_dimension(self)
end
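
A visitor only needs to respond to visit_dimension. The sketch below shows the double dispatch with two hypothetical classes: SketchDimension stands in for Dimension, and NameCollector is an invented visitor that gathers dimension names.

```ruby
# Hypothetical stand-in for Dimension, reduced to a name and #visit.
class SketchDimension
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Same dispatch as Dimension#visit: hand self to the visitor.
  def visit(visitor)
    visitor.visit_dimension(self)
  end
end

# Invented visitor that records the name of every dimension it is shown.
class NameCollector
  attr_reader :names

  def initialize
    @names = []
  end

  def visit_dimension(dimension)
    @names << dimension.name
  end
end

collector = NameCollector.new
dimensions = [SketchDimension.new(:customer), SketchDimension.new(:date)]
dimensions.each { |d| d.visit(collector) }
```

Because visit returns whatever visit_dimension returns, a visitor can either accumulate state, as here, or compute a value per dimension.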