Module: Audumbla::Enrichment

Included in:
FieldEnrichment
Defined in:
lib/audumbla/enrichment.rb

Overview

Mixin module for enriching a set of input_fields and setting the resulting values to a set of output fields.

Instance Method Details

#enrich(record, input_fields, output_fields) ⇒ ActiveTriples::Resource

The main enrichment method. It passes the specified input fields to #enrich_value, which must return an array of values whose length equals the number of output fields. Each output field is then set to the corresponding result from the enrichment.

Pass fields to `input_fields` and `output_fields`. Fields are formatted as symbols in nested hashes, targeting a particular field in an ActiveTriples Resource property hierarchy:

:sourceResource
{:sourceResource => :spatial}
{:sourceResource => {:creator => :name}}

The record passed in is not altered, but cloned before the enrichment is applied. A common pattern may be:

record = my_enrichment.enrich(record, input, output)
record.persist!

Each input field selects the values of all matching fields and collects them into an array. For example:

an array of values from record.sourceResource:

:sourceResource

an array of values combining spatial fields from the values of record.sourceResource:

{:sourceResource => :spatial}

an array of values combining name fields from the creators in record.sourceResource:

{:sourceResource => {:creator => :name}}
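
The selection rules above can be sketched in plain Ruby. This is only an illustration: `Creator`, `SourceResource`, `Item`, and `select_values` are hypothetical stand-ins, not part of Audumbla or ActiveTriples, and the library's own field resolution may differ in detail.

```ruby
# Hypothetical stand-ins for an ActiveTriples resource hierarchy.
Creator        = Struct.new(:name)
SourceResource = Struct.new(:creator, :spatial)
Item           = Struct.new(:sourceResource)

# Hypothetical helper: walk a field spec (a Symbol or a nested Hash)
# and collect the values of every matching field into one flat array.
def select_values(objects, field)
  objects = [objects].flatten
  case field
  when Symbol
    objects.flat_map { |obj| [obj.public_send(field)].flatten }
  when Hash
    parent, child = field.first
    select_values(select_values(objects, parent), child)
  else
    raise ArgumentError, "unsupported field spec: #{field.inspect}"
  end
end

item = Item.new(
  SourceResource.new(
    [Creator.new(['Moomintroll']), Creator.new(['Moomin Papa'])],
    ['Moominvalley']
  )
)

select_values(item, { sourceResource: :spatial })
# => ["Moominvalley"]
select_values(item, { sourceResource: { creator: :name } })
# => ["Moomintroll", "Moomin Papa"]
```

The nested hash is consumed one level at a time, so a deeper spec simply narrows the set of objects before the final symbol is read.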

Output fields should be specified at a high enough level that the enrichment can build a complete value set from the input values provided. An enrichment that maps names to LCSH URIs and alters all creator fields might be called like this:

my_enrichment.enrich(record,
  [{:sourceResource => {:creator => :providedLabel}}],
  [{:sourceResource => :creator}])

This would pass values like the following, sourced from providedLabel, to #enrich_value:

[['Moomintroll', 'Moomin Papa', 'Moomin Mama']]

And it would expect to receive an array of values set directly to creator, overwriting all existing creator values:

[DPLA::MAP::Agent:0x3ff(default),
 DPLA::MAP::Agent:0x9f5(default),
 DPLA::MAP::Agent:0x3a8(default)]

Parameters:

  • record (ActiveTriples::Resource)

    the record to enrich

  • input_fields (Array)

    the fields whose values to pass to the enrichment method

  • output_fields (Array)

    the fields on which to apply the enrichment

Returns:

  • (ActiveTriples::Resource)

    the enriched record



# File 'lib/audumbla/enrichment.rb', line 67

def enrich(record, input_fields, output_fields)
  enrich!(record.clone, input_fields, output_fields)
end
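
The pairing of a cloning method with a bang method that mutates its argument can be shown in isolation. This is a minimal sketch using a hypothetical `Doc` struct and hypothetical `shout`/`shout!` methods; it relies on Ruby's shallow `clone`, and makes no claim about how deeply an ActiveTriples resource is copied.

```ruby
# Hypothetical stand-in for a record with a single field.
Doc = Struct.new(:title)

# Destructive variant: changes the object it is given and returns it.
def shout!(doc)
  doc.title = doc.title.upcase
  doc
end

# Non-destructive variant: works on a clone, as #enrich does above.
def shout(doc)
  shout!(doc.clone)
end

original = Doc.new('moomin')
copy = shout(original)

copy.title      # => "MOOMIN"
original.title  # => "moomin"
```

Because the caller receives a new object, the result has to be captured, which is why the common pattern reassigns `record` before persisting it.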

#enrich!(record, input_fields, output_fields) ⇒ Object

Runs the enrichment directly on the given record.

See Also:



# File 'lib/audumbla/enrichment.rb', line 75

def enrich!(record, input_fields, output_fields)
  output_fields.map! { |f| field_to_chain(f) }

  values = values_from_fields(record, input_fields)
  values = enrich_value(values).dup

  raise 'field/value mismatch.' \
    "#{values.count} values for #{output_fields.count} fields." unless
    values.count == output_fields.count

  output_fields.each { |field| set_field(record, field, values.shift) }
  record
end
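
The flow of the listing (collect input values, enrich them, check the count, assign one result per output field) can be sketched without the library. In this simplified version the record is a plain Hash, fields are top-level keys, and the enrichment is passed as a block; `apply_enrichment!` is a hypothetical name, and the sketch raises ArgumentError where the listing raises a RuntimeError.

```ruby
# Hypothetical, simplified pipeline mirroring the steps of #enrich!.
def apply_enrichment!(record, input_fields, output_fields)
  # One array of values per input field.
  values = input_fields.map { |field| Array(record[field]) }

  # The block plays the role of #enrich_value.
  results = yield(values).dup

  unless results.count == output_fields.count
    raise ArgumentError,
          "field/value mismatch: #{results.count} values " \
          "for #{output_fields.count} fields"
  end

  # Each output field takes the result at the same position.
  output_fields.each { |field| record[field] = results.shift }
  record
end

record = { label: ['Moomintroll', 'Moomin Papa'] }
apply_enrichment!(record, [:label], [:creator]) do |values|
  [values.first.map(&:upcase)] # one entry for the single output field
end
record[:creator] # => ["MOOMINTROLL", "MOOMIN PAPA"]
```

Returning the two names without the outer array would yield two values for one output field and trigger the mismatch error. Note also that the listing's `output_fields.map!` rewrites the caller's array in place, whereas this sketch leaves its arguments untouched.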

#enrich_value(_) ⇒ Array

This method is abstract.

Runs the enrichment against the values of a field.

Accept an array of values from an ActiveTriples::Resource property, and return an array of values to set to output fields.

Parameters:

  • _ (ActiveTriples::Resource, RDF::Literal)

    the value(s) to process

Returns:

  • (Array)

    the values to set on the output fields, one entry per field

Raises:

  • (NotImplementedError)


# File 'lib/audumbla/enrichment.rb', line 97

def enrich_value(_)
  raise NotImplementedError
end
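
A class that mixes in the module is expected to override this hook. The sketch below shows the shape of such an override under stated assumptions: `EnrichmentStub` is a hypothetical stand-in for the mixin so the example runs without the gem, and `StripWhitespace` and `Unfinished` are hypothetical classes.

```ruby
# Hypothetical stand-in for the mixin's abstract hook.
module EnrichmentStub
  def enrich_value(_)
    raise NotImplementedError
  end
end

# Hypothetical enrichment: trims whitespace from every value.
class StripWhitespace
  include EnrichmentStub

  # values holds one array per input field; the result holds one
  # array per output field, so the counts line up in #enrich!.
  def enrich_value(values)
    values.map do |field_values|
      field_values.map { |value| value.to_s.strip }
    end
  end
end

# A class that forgets to override the hook keeps the abstract behavior.
class Unfinished
  include EnrichmentStub
end

StripWhitespace.new.enrich_value([[' Moomintroll ', 'Moomin Papa  ']])
# => [["Moomintroll", "Moomin Papa"]]
```

Calling `Unfinished.new.enrich_value([])` raises NotImplementedError. Since that error is a ScriptError rather than a StandardError, a bare `rescue` will not catch it; it has to be rescued by name.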

#list_fields(record) ⇒ Object



# File 'lib/audumbla/enrichment.rb', line 101

def list_fields(record)
  fields = []
  record.class.properties.each do |prop, _|
    fields << prop.to_sym

    objs = resources(record.send(fields.last)).map { |r| list_fields(r) }
    next if objs.empty?

    objs.flatten.each { |obj| fields << { prop => obj } }
  end
  fields
end
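
Read from the listing, #list_fields appears to return every field of a record in the same symbol and nested-hash notation that #enrich accepts. The sketch below reproduces the recursion on plain Ruby objects. `Node`, `AgentNode`, `SourceNode`, `RecordNode`, and this version of `resources` are hypothetical stand-ins, and unlike the listing the sketch normalizes nested keys to symbols.

```ruby
# Hypothetical stand-in for a resource class that exposes `properties`.
class Node
  def self.property(*names)
    @properties = names
    attr_accessor(*names)
  end

  # String keys, loosely imitating a property registry.
  def self.properties
    (@properties || []).to_h { |name| [name.to_s, nil] }
  end
end

class AgentNode < Node
  property :name
end

class SourceNode < Node
  property :creator, :spatial
end

class RecordNode < Node
  property :sourceResource
end

# Hypothetical helper: keep only values that are themselves nodes.
def resources(values)
  [values].flatten.compact.select { |value| value.is_a?(Node) }
end

# Same recursion as the listing, with symbol keys throughout.
def list_fields(record)
  fields = []
  record.class.properties.each_key do |prop|
    name = prop.to_sym
    fields << name

    nested = resources(record.public_send(name)).map { |r| list_fields(r) }
    next if nested.empty?

    nested.flatten.each { |child| fields << { name => child } }
  end
  fields
end

agent = AgentNode.new
agent.name = 'Moomintroll'
source = SourceNode.new
source.creator = [agent]
source.spatial = ['Moominvalley']
record = RecordNode.new
record.sourceResource = source

list_fields(record)
# => [:sourceResource,
#     {:sourceResource=>:creator},
#     {:sourceResource=>{:creator=>:name}},
#     {:sourceResource=>:spatial}]
```

With several nested resources under one property, the recursion visits each of them, so the same nested field can appear more than once in the result.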