Module: ScraperUtils::DebugUtils

Defined in:
lib/scraper_utils/debug_utils.rb

Overview

Utilities for debugging web scraping processes

Constant Summary

DEBUG_ENV_VAR = "DEBUG"
MORPH_DEBUG_ENV_VAR = "MORPH_DEBUG"

Debug level constants

DISABLED_LEVEL = 0
BASIC_LEVEL = 1
VERBOSE_LEVEL = 2
TRACE_LEVEL = 3

Class Method Details

.basic? ⇒ Boolean

Check if basic debug output or higher is enabled

Returns:

  • (Boolean)

    true if debugging is enabled at basic level or higher


# File 'lib/scraper_utils/debug_utils.rb', line 35

def self.basic?
  debug?(BASIC_LEVEL)
end

.debug?(level = BASIC_LEVEL) ⇒ Boolean

Check if debug is enabled at specified level or higher

Parameters:

  • level (Integer) (defaults to: BASIC_LEVEL)

    Minimum debug level to check for

Returns:

  • (Boolean)

    true if the current debug level is at or above the specified level


# File 'lib/scraper_utils/debug_utils.rb', line 29

def self.debug?(level = BASIC_LEVEL)
  debug_level >= level
end
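
As a rough illustration of the threshold comparison above, the sketch below lists which of the level checks would pass for a given debug level. The module `ThresholdSketch` and its `enabled_checks` helper are hypothetical names invented for this example; they are not part of ScraperUtils, and the level values are copied from the constants documented on this page.

```ruby
# Hypothetical stand-alone sketch; not part of ScraperUtils.
module ThresholdSketch
  # Minimum level required by each predicate (values from the constants above).
  LEVELS = { basic: 1, verbose: 2, trace: 3 }.freeze

  # Returns the names of the checks that pass at current_level,
  # using the same `current >= minimum` rule as debug?.
  def self.enabled_checks(current_level)
    LEVELS.select { |_name, minimum| current_level >= minimum }.keys
  end
end

p ThresholdSketch.enabled_checks(0) # => []
p ThresholdSketch.enabled_checks(2) # => [:basic, :verbose]
p ThresholdSketch.enabled_checks(3) # => [:basic, :verbose, :trace]
```

Because the comparison is `>=`, a higher level always implies every lower one.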

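.debug_level ⇒ Integer

Get the current debug level (0 = disabled, 1 = basic, 2 = verbose, 3 = trace). Checks the DEBUG environment variable first and falls back to MORPH_DEBUG. A value that does not start with a digit (for example "true") is treated as the basic level.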

Returns:

  • (Integer)

    Debug level


# File 'lib/scraper_utils/debug_utils.rb', line 20

def self.debug_level
  debug = ENV.fetch(DEBUG_ENV_VAR, ENV.fetch(MORPH_DEBUG_ENV_VAR, '0'))
  debug =~ /^\d/ ? debug.to_i : BASIC_LEVEL
end
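
The parsing rule in the source above can be tried in isolation with the sketch below. `DebugLevelSketch.level_for` is a hypothetical helper written for this example (it is not a ScraperUtils method); it takes the two variable values as arguments, with `nil` standing for "not set", instead of reading `ENV`.

```ruby
# Hypothetical stand-alone sketch; not part of ScraperUtils.
module DebugLevelSketch
  BASIC_LEVEL = 1

  # debug / morph_debug are the raw values of DEBUG and MORPH_DEBUG,
  # or nil when the variable is not set.
  def self.level_for(debug, morph_debug = nil)
    # DEBUG wins whenever it is set, MORPH_DEBUG is the fallback,
    # and "0" is used when neither is set.
    raw = debug || morph_debug || "0"
    # A leading digit is parsed as the level; anything else means basic.
    raw =~ /^\d/ ? raw.to_i : BASIC_LEVEL
  end
end

p DebugLevelSketch.level_for(nil)        # => 0 (neither variable set)
p DebugLevelSketch.level_for("2")        # => 2
p DebugLevelSketch.level_for("true")     # => 1 (non-numeric means basic)
p DebugLevelSketch.level_for(nil, "3")   # => 3 (falls back to MORPH_DEBUG)
p DebugLevelSketch.level_for("", "3")    # => 1 (an empty DEBUG still takes precedence)
```

The last case is worth noting: a variable that is set but empty does not disable debugging, since an empty string does not start with a digit.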

.debug_page(page, message) ⇒ void

This method returns an undefined value.

Logs the URL, title, and full body of a web page when trace-level debugging is enabled

Parameters:

  • page (Mechanize::Page)

    The web page to debug

  • message (String)

    Context or description for the debug output


# File 'lib/scraper_utils/debug_utils.rb', line 76

def self.debug_page(page, message)
  return unless trace?

  puts
  LogUtils.log "🔍 DEBUG: #{message}"
  puts "Current URL: #{page.uri}"
  puts "Page title: #{page.at('title').text.strip}" if page.at("title")
  puts "",
       "Page content:",
       "-" * 40,
       page.body,
       "-" * 40
  $stdout.flush
end

.debug_request(http_method, url, parameters: nil, headers: nil, body: nil) ⇒ void

This method returns an undefined value.

Logs details of an HTTP request when debugging is enabled at basic level or higher

Parameters:

  • http_method (String)

    HTTP method (GET, POST, etc.)

  • url (String)

    Request URL

  • parameters (Hash, nil) (defaults to: nil)

    Optional request parameters

  • headers (Hash, nil) (defaults to: nil)

    Optional request headers

  • body (Hash, nil) (defaults to: nil)

    Optional request body


# File 'lib/scraper_utils/debug_utils.rb', line 60

def self.debug_request(http_method, url, parameters: nil, headers: nil, body: nil)
  return unless basic?

  puts
  LogUtils.log "🔍 #{http_method.upcase} #{url}"
  puts "Parameters:", JSON.pretty_generate(parameters) if parameters
  puts "Headers:", JSON.pretty_generate(headers) if headers
  puts "Body:", JSON.pretty_generate(body) if body
  $stdout.flush
end
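
To give a feel for the output format, the sketch below builds the lines that the method above would print, without the debug-level guard. `request_debug_lines` is a hypothetical helper written for this example, the URL and parameters are made up, and the `LogUtils.log` call is replaced by a plain string without the emoji prefix, since the exact behaviour of `LogUtils.log` is not shown on this page.

```ruby
require "json"

# Hypothetical stand-alone sketch; not part of ScraperUtils.
# Returns the lines instead of printing them so they are easy to inspect.
def request_debug_lines(http_method, url, parameters: nil, headers: nil, body: nil)
  lines = ["#{http_method.upcase} #{url}"]
  # Each optional section is emitted only when a value was supplied.
  lines.push("Parameters:", JSON.pretty_generate(parameters)) if parameters
  lines.push("Headers:", JSON.pretty_generate(headers)) if headers
  lines.push("Body:", JSON.pretty_generate(body)) if body
  lines
end

puts request_debug_lines(
  "get",
  "https://example.com/search",
  parameters: { "q" => "planning" }
)
```

With these inputs the method name is upcased to `GET`, and only the "Parameters:" section appears because `headers` and `body` are left as `nil`.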

.debug_selector(page, selector, message) ⇒ void

This method returns an undefined value.

Logs the first element matching a selector, or the full page body when nothing matches, when trace-level debugging is enabled

Parameters:

  • page (Mechanize::Page)

    The web page to inspect

  • selector (String)

    CSS selector to look for

  • message (String)

    Context or description for the debug output


# File 'lib/scraper_utils/debug_utils.rb', line 97

def self.debug_selector(page, selector, message)
  return unless trace?

  puts
  LogUtils.log "🔍 DEBUG: #{message}"
  puts "Looking for selector: #{selector}"
  element = page.at(selector)
  if element
    puts "Found element:"
    puts element.to_html
  else
    puts "Element not found in:"
    puts "-" * 40
    puts page.body
    puts "-" * 40
  end
  $stdout.flush
end
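
The found / not-found branching above can be illustrated without Mechanize. In the sketch below, `FakePage` is a hypothetical stand-in for `Mechanize::Page` whose `at` simply looks up a pre-stored HTML fragment, and `selector_report` is a hypothetical helper that returns the lines instead of printing them. Neither is part of ScraperUtils, and the trace-level guard is omitted.

```ruby
# Hypothetical stand-alone sketch; not part of ScraperUtils.
module SelectorSketch
  # Stand-in for Mechanize::Page: `at` returns the stored fragment for a
  # selector, or nil when nothing matches.
  FakePage = Struct.new(:body, :fragments) do
    def at(selector)
      fragments[selector]
    end
  end

  # Mirrors the branching of debug_selector: show the match when there is
  # one, otherwise show the whole body between separator lines.
  def self.selector_report(page, selector)
    element = page.at(selector)
    if element
      ["Found element:", element]
    else
      ["Element not found in:", "-" * 40, page.body, "-" * 40]
    end
  end
end

page = SelectorSketch::FakePage.new(
  "<html><h1>Hi</h1></html>",
  { "h1" => "<h1>Hi</h1>" }
)
puts SelectorSketch.selector_report(page, "h1")
puts SelectorSketch.selector_report(page, "table.results")
```

Dumping the whole body on a miss is verbose, which is presumably why the real method only runs at trace level.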

.trace? ⇒ Boolean

Check if trace-level debug output is enabled

Returns:

  • (Boolean)

    true if debugging is enabled at trace level


# File 'lib/scraper_utils/debug_utils.rb', line 47

def self.trace?
  debug?(TRACE_LEVEL)
end

.verbose? ⇒ Boolean

Check if verbose debug output or higher is enabled

Returns:

  • (Boolean)

    true if debugging is enabled at verbose level or higher


# File 'lib/scraper_utils/debug_utils.rb', line 41

def self.verbose?
  debug?(VERBOSE_LEVEL)
end