Class: ScraperUtils::DateRangeUtils

Inherits:
Object
Defined in:
lib/scraper_utils/date_range_utils.rb

Constant Summary

MERGE_ADJACENT_RANGES = true
PERIODS = [2, 3, 4].freeze


Class Attribute Details

.default_days ⇒ Integer

Returns the default number of days to cover.

Returns:

  • (Integer)

    Default number of days to cover



# File 'lib/scraper_utils/date_range_utils.rb', line 10

def default_days
  @default_days
end

.default_everytime ⇒ Integer

Returns the default number of days to always include in ranges.

Returns:

  • (Integer)

    Default days to always include in ranges



# File 'lib/scraper_utils/date_range_utils.rb', line 13

def default_everytime
  @default_everytime
end

.default_max_period ⇒ Integer?

Returns the default maximum number of days between checks of any one date.

Returns:

  • (Integer, nil)

    Default maximum number of days between checks of any one date



# File 'lib/scraper_utils/date_range_utils.rb', line 16

def default_max_period
  @default_max_period
end

Instance Attribute Details

#extended_max_period ⇒ Object (readonly)

Returns the value of attribute extended_max_period.



# File 'lib/scraper_utils/date_range_utils.rb', line 44

def extended_max_period
  @extended_max_period
end

#max_period_used ⇒ Object (readonly)

Returns the value of attribute max_period_used.



# File 'lib/scraper_utils/date_range_utils.rb', line 43

def max_period_used
  @max_period_used
end

Class Method Details

.configure {|self| ... } ⇒ void

This method returns an undefined value.

Configure default settings for all DateRangeUtils instances

Examples:

ScraperUtils::DateRangeUtils.configure do |config|
  config.default_everytime = 3
  config.default_days = 35
  config.default_max_period = 5
end

Yields:

  • (self)

    Yields self for configuration



# File 'lib/scraper_utils/date_range_utils.rb', line 27

def configure
  yield self if block_given?
end
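The block-yielding pattern above can be tried in isolation. The sketch below is a minimal stand-in, not the library's code: `DemoDateSettings` is a hypothetical class, and it assumes the class-level settings have writers as well as readers (which the `configure` example implies).

```ruby
# Hypothetical stand-in class; only the configure/yield pattern mirrors the listing above.
class DemoDateSettings
  class << self
    # Class-level settings, analogous to default_days and friends
    attr_accessor :default_days, :default_everytime, :default_max_period

    # Yields the class itself so the block can assign settings;
    # does nothing when called without a block
    def configure
      yield self if block_given?
    end
  end
end

DemoDateSettings.configure do |config|
  config.default_days = 35
  config.default_everytime = 3
end

puts DemoDateSettings.default_days       # prints 35
puts DemoDateSettings.default_everytime  # prints 3
```

Settings not assigned in the block (here `default_max_period`) keep whatever value they had before, which is `nil` in this stand-in.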

.reset_defaults! ⇒ void

This method returns an undefined value.

Reset all configuration options to their default values



# File 'lib/scraper_utils/date_range_utils.rb', line 33

def reset_defaults!
  @default_days = ENV.fetch('MORPH_DAYS', 33).to_i # defaults to 33
  @default_everytime = ENV.fetch('MORPH_EVERYTIME', 4).to_i # defaults to 4
  @default_max_period = ENV.fetch('MORPH_MAX_PERIOD', 2).to_i # defaults to 2
end
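The `ENV.fetch(...).to_i` idiom used above has one behaviour worth knowing: a non-numeric value does not raise, it becomes 0. The sketch below illustrates this with a hypothetical helper, `integer_setting`, which takes any Hash-like object so it can be run without modifying the real environment.

```ruby
# Hypothetical helper, not part of ScraperUtils.
# Accepts any object that responds to fetch (a Hash here, ENV in real use).
def integer_setting(env, key, fallback)
  # fetch returns the stored string when the key exists, otherwise the fallback;
  # to_i then converts either one to an Integer
  env.fetch(key, fallback).to_i
end

puts integer_setting({ 'MORPH_DAYS' => '60' }, 'MORPH_DAYS', 33)   # prints 60
puts integer_setting({}, 'MORPH_DAYS', 33)                         # prints 33
puts integer_setting({ 'MORPH_DAYS' => 'abc' }, 'MORPH_DAYS', 33)  # prints 0
```

Because `String#to_i` is lenient, a mistyped environment variable silently yields 0 rather than falling back to the default.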

Instance Method Details

#calculate_date_ranges(days: nil, everytime: nil, max_period: nil, today: nil) ⇒ Array{[Date, Date, String]}

Generates one or more date ranges, checking the most recent dates daily and older dates progressively less often, down to once every `max_period` days. The schedule is graduated: the latest `everytime` days are always included, and the oldest of the `days` dates are checked every `max_period` days. It uses a Fibonacci sequence to create a natural progression of check frequencies: newer data is checked more frequently, and the period between checks grows along the sequence (2, 3, 5, 8, 13…) until it reaches `max_period`. The result is an efficient schedule that mimics natural information decay patterns.
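To make the progression in the description concrete, the sketch below lists the Fibonacci-style periods that fit under a given cap. `fibonacci_periods` is a hypothetical helper written for illustration only; it is not the library's implementation, and since the `PERIODS` constant on this page is `[2, 3, 4]`, the class itself may not follow the sequence exactly.

```ruby
# Hypothetical helper, not part of ScraperUtils: returns the periods
# 2, 3, 5, 8, ... (each the sum of the previous two) that do not exceed max_period.
def fibonacci_periods(max_period)
  periods = []
  a = 2
  b = 3
  while a <= max_period
    periods << a
    a, b = b, a + b # advance one step along the sequence
  end
  periods
end

p fibonacci_periods(4)   # prints [2, 3]
p fibonacci_periods(13)  # prints [2, 3, 5, 8, 13]
```

With a cap of 1 the list is empty, which corresponds to checking every date on every run.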

Parameters:

  • days (Integer, nil) (defaults to: nil)

    create ranges that cover the last `days` dates

  • everytime (Integer, nil) (defaults to: nil)

    always include the latest `everytime` out of `days` dates (minimum 1)

  • max_period (Integer, nil) (defaults to: nil)

    the last `days` dates must be checked at least every `max_period` days (1..4)

  • today (Date, nil) (defaults to: nil)

    overrides the default determination of today at UTC+09:30 (middle of Australia)

Returns:

  • (Array{[Date, Date, String]})

    being from_date, to_date and a comment



# File 'lib/scraper_utils/date_range_utils.rb', line 58

def calculate_date_ranges(days: nil, everytime: nil, max_period: nil, today: nil)
  _calculate_date_ranges(
    Integer(days || self.class.default_days),
    [1, Integer(everytime || self.class.default_everytime)].max,
    Integer(max_period || self.class.default_max_period),
    today || Time.now(in: '+09:30').to_date
  )
end
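The wrapper above only resolves and coerces its arguments before delegating. The sketch below reproduces that step on its own so the effect of each expression can be inspected. `resolve_options` and `demo_defaults` are hypothetical names; the default values are copied from `reset_defaults!` and assume no `MORPH_*` environment overrides.

```ruby
require 'date'

# Hypothetical stand-in for the class-level defaults (values from reset_defaults!)
def demo_defaults
  { days: 33, everytime: 4, max_period: 2 }
end

# Mirrors the argument handling in calculate_date_ranges, returning the
# resolved values instead of passing them on
def resolve_options(days: nil, everytime: nil, max_period: nil, today: nil)
  [
    Integer(days || demo_defaults[:days]),                   # strict conversion, raises on 'abc'
    [1, Integer(everytime || demo_defaults[:everytime])].max, # never below 1
    Integer(max_period || demo_defaults[:max_period]),
    today || Time.now(in: '+09:30').to_date                  # only evaluated when today is nil
  ]
end

days, everytime, max_period, today = resolve_options(everytime: 0, today: Date.new(2025, 3, 1))
puts days        # prints 33
puts everytime   # prints 1
puts max_period  # prints 2
puts today       # prints 2025-03-01
```

Unlike `to_i`, `Integer(...)` raises an `ArgumentError` for non-numeric strings, so a bad argument fails loudly here rather than becoming 0.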