Module: VectorMCP::ImageUtil

Defined in: lib/vector_mcp/image_util.rb

Overview

Provides comprehensive image handling utilities for VectorMCP operations, including format detection, validation, encoding/decoding, and conversion to MCP-compliant image content format.

Constant Summary

IMAGE_SIGNATURES = Common image MIME types and their magic byte signatures

{
  "image/jpeg" => [
    [0xFF, 0xD8, 0xFF].pack("C*"),
    [0xFF, 0xD8].pack("C*")
  ],
  "image/png" => [[0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*")],
  "image/gif" => %w[
    GIF87a
    GIF89a
  ],
  "image/webp" => [
    "WEBP"
  ],
  "image/bmp" => [
    "BM"
  ],
  "image/tiff" => [
    "II*\0",
    "MM\0*"
  ]
}.freeze
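
Each signature above is a raw byte string, so the integer-array form and an escaped string literal describe the same bytes. The following is a small standalone sketch in core Ruby; the constant name `PNG_MAGIC` is illustrative and not part of the module.

```ruby
# Illustrative constant (not part of VectorMCP): the PNG signature built
# the same way as the "image/png" entry in IMAGE_SIGNATURES.
PNG_MAGIC = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*")

# pack("C*") writes each integer as one unsigned byte and returns a
# binary (ASCII-8BIT) string.
PNG_MAGIC.encoding == Encoding::BINARY # => true
PNG_MAGIC == "\x89PNG\r\n\x1A\n".b     # => true

# The TIFF entries are plain string literals; "\0" is a NUL byte.
"II*\0".bytes # => [73, 73, 42, 0]  (little-endian TIFF)
"MM\0*".bytes # => [77, 77, 0, 42]  (big-endian TIFF)
```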

DEFAULT_MAX_SIZE = Maximum image size in bytes (default: 10MB)

10 * 1024 * 1024

Class Method Summary

.base64_string?(string) ⇒ Boolean

Checks if a string appears to be base64 encoded.
.decode_base64(base64_string) ⇒ String

Decodes base64 string to binary image data.
.detect_image_format(data) ⇒ String?

Detects the MIME type of image data based on magic bytes.
.determine_final_mime_type(explicit_mime_type, detected_mime_type) ⇒ Object private

Determines the final MIME type to use, preferring explicit over detected.
.encode_base64(data) ⇒ String

Encodes binary image data to base64 string.
.extract_dimensions(data, mime_type) ⇒ Hash

Extracts basic image dimensions for common formats.
.extract_gif_dimensions(data) ⇒ Object

Extracts GIF dimensions from header.
.extract_jpeg_dimensions(data) ⇒ Object

Extracts JPEG dimensions from SOF marker.
.extract_metadata(data) ⇒ Hash

Extracts image metadata from binary data.
.extract_png_dimensions(data) ⇒ Object

Extracts PNG dimensions from IHDR chunk.
.file_to_mcp_image_content(file_path, validate: true, max_size: DEFAULT_MAX_SIZE, base_directory: nil) ⇒ Hash

Converts file path to MCP-compliant image content.
.process_image_data(data) ⇒ Object private

Processes input data to extract both binary and base64 representations.
.to_mcp_image_content(data, mime_type: nil, validate: true, max_size: DEFAULT_MAX_SIZE) ⇒ Hash

Converts image data to MCP-compliant image content format.
.validate_and_detect_format(binary_data, validate, max_size) ⇒ Object private

Validates image data and detects MIME type if validation is enabled.
.validate_image(data, max_size: DEFAULT_MAX_SIZE, allowed_formats: nil) ⇒ Hash

Validates if the provided data is a valid image.
.validate_path_safety!(file_path, base_directory) ⇒ Object private

Validates that a file path does not escape the given base directory.

Class Method Details

.base64_string?(string) ⇒ `Boolean`

Checks if a string appears to be base64 encoded.

Parameters:

string (String) —

The string to check.

Returns:

(Boolean) —

True if the string appears to be base64.

# File 'lib/vector_mcp/image_util.rb', line 286

def base64_string?(string)
  return false if string.nil? || string.empty?

  # Base64 strings should only contain valid base64 characters
  # and be properly padded with correct length
  return false unless string.match?(%r{\A[A-Za-z0-9+/]*={0,2}\z})

  # Allow both padded and unpadded base64, but require proper structure
  # For unpadded base64, length should be at least 4 and not result in invalid decoding
  if string.include?("=")
    # Padded base64 must be multiple of 4
    (string.length % 4).zero?
  else
    # Unpadded base64 - try to decode to see if it's valid
    return false if string.length < 4

    begin
      # Add padding and try to decode
      padded = string + ("=" * ((4 - (string.length % 4)) % 4))
      Base64.strict_decode64(padded)
      true
    rescue ArgumentError
      false
    end
  end
end
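
The padding arithmetic used for unpadded input can be illustrated in isolation. This is a minimal sketch, assuming a hypothetical helper named `pad_base64` that is not part of the module. Note the grouping: `*` and `%` share the same precedence in Ruby, so the modulo has to be parenthesized before the string is repeated.

```ruby
require "base64"

# Hypothetical helper (not part of VectorMCP): restores the "=" padding
# that strict decoding requires.
def pad_base64(str)
  missing = (4 - (str.length % 4)) % 4 # 0 when the length is already a multiple of 4
  str + ("=" * missing)
end

pad_base64("YWJj") # => "YWJj"  (nothing to add)
pad_base64("YWI")  # => "YWI="
pad_base64("YQ")   # => "YQ=="

Base64.strict_decode64(pad_base64("YWI")) # => "ab"
```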

.decode_base64(base64_string) ⇒ `String`

Decodes base64 string to binary image data.

Examples:

data = VectorMCP::ImageUtil.decode_base64(encoded_string)

Parameters:

base64_string (String) —

Base64 encoded image data.

Returns:

(String) —

Binary image data.

Raises:

(ArgumentError) —

If base64 string is invalid.

# File 'lib/vector_mcp/image_util.rb', line 132

def decode_base64(base64_string)
  Base64.strict_decode64(base64_string)
rescue ArgumentError => e
  raise ArgumentError, "Invalid base64 encoding: #{e.message}"
end
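
Because the method relies on strict decoding, input that a lenient decoder would accept can still be rejected. A short standalone sketch using only the standard `Base64` module (the sample string is made up for illustration):

```ruby
require "base64"

wrapped = "aGVs\nbG8=" # "hello", with a line break inside the encoding

# The lenient decoder skips the line break.
Base64.decode64(wrapped) # => "hello"

# The strict decoder rejects it with ArgumentError.
begin
  Base64.strict_decode64(wrapped)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Callers that receive line-wrapped base64 (for example from MIME bodies) may need to strip whitespace before passing the string in.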

.detect_image_format(data) ⇒ `String`?

Detects the MIME type of image data based on magic bytes.

Examples:

VectorMCP::ImageUtil.detect_image_format(File.binread("image.jpg"))
# => "image/jpeg"

Parameters:

data (String) —

The binary image data.

Returns:

(String, nil) —

The detected MIME type, or nil if not recognized.

# File 'lib/vector_mcp/image_util.rb', line 47

def detect_image_format(data)
  return nil if data.nil? || data.empty?

  # Ensure we have binary data (dup to avoid modifying frozen strings)
  binary_data = data.dup.force_encoding(Encoding::ASCII_8BIT)

  IMAGE_SIGNATURES.each do |mime_type, signatures|
    signatures.each do |signature|
      case mime_type
      when "image/webp"
        # WebP files start with RIFF then have WEBP at offset 8
        return mime_type if binary_data.start_with?("RIFF") && binary_data[8, 4] == signature
      else
        return mime_type if binary_data.start_with?(signature)
      end
    end
  end

  nil
end
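
The WebP branch differs from the others because `RIFF` is a generic container: the format tag sits at byte offset 8, after a 4-byte size field. The sketch below is a trimmed-down, hypothetical detector (`sniff` is not the module's method and covers only three formats) that shows prefix matching next to the offset check.

```ruby
# Hypothetical, trimmed-down detector (not the module's implementation).
def sniff(bytes)
  data = bytes.b # binary copy, so comparisons are byte-wise
  return "image/png" if data.start_with?("\x89PNG\r\n\x1A\n".b)
  return "image/gif" if data.start_with?("GIF87a", "GIF89a")
  # RIFF layout: bytes 0-3 "RIFF", bytes 4-7 chunk size, bytes 8-11 form type
  return "image/webp" if data.start_with?("RIFF") && data.byteslice(8, 4) == "WEBP"

  nil
end

webp = "RIFF".b + [26].pack("V") + "WEBPVP8 ".b
wave = "RIFF".b + [36].pack("V") + "WAVEfmt ".b

sniff(webp)              # => "image/webp"
sniff(wave)              # => nil (a RIFF container, but not WebP)
sniff("GIF89a\x01\x00")  # => "image/gif"
```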

.determine_final_mime_type(explicit_mime_type, detected_mime_type) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Determines the final MIME type to use, preferring explicit over detected.

Raises:

(ArgumentError)

# File 'lib/vector_mcp/image_util.rb', line 208

def determine_final_mime_type(explicit_mime_type, detected_mime_type)
  final_mime_type = explicit_mime_type || detected_mime_type
  raise ArgumentError, "Could not determine image MIME type" if final_mime_type.nil?

  final_mime_type
end

.encode_base64(data) ⇒ `String`

Encodes binary image data to base64 string.

Examples:

encoded = VectorMCP::ImageUtil.encode_base64(File.binread("image.jpg"))

Parameters:

data (String) —

The binary image data.

Returns:

(String) —

Base64 encoded string.



# File 'lib/vector_mcp/image_util.rb', line 120

def encode_base64(data)
  Base64.strict_encode64(data)
end
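
The strict encoder matters here because its output has no line breaks, which keeps the value safe to embed in a single JSON string field. A standalone sketch using only the standard `Base64` module (the payload is arbitrary sample data):

```ruby
require "base64"

payload = "x" * 100

Base64.encode64(payload).include?("\n")        # => true  (output is wrapped with newlines)
Base64.strict_encode64(payload).include?("\n") # => false (single line)

# A round trip through the strict pair returns the original bytes.
Base64.strict_decode64(Base64.strict_encode64(payload)) == payload # => true
```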

.extract_dimensions(data, mime_type) ⇒ `Hash`

Extracts basic image dimensions for common formats. This is a simplified implementation; for production use, consider using a proper image library like MiniMagick or ImageMagick.

Parameters:

data (String) —

Binary image data.
mime_type (String) —

Detected MIME type.

Returns:

(Hash) —

Hash containing width/height if detectable.

# File 'lib/vector_mcp/image_util.rb', line 320

def extract_dimensions(data, mime_type)
  case mime_type
  when "image/png"
    extract_png_dimensions(data)
  when "image/jpeg"
    extract_jpeg_dimensions(data)
  when "image/gif"
    extract_gif_dimensions(data)
  else
    {}
  end
rescue StandardError
  {} # Return empty hash if dimension extraction fails
end

.extract_gif_dimensions(data) ⇒ `Object`

Extracts GIF dimensions from header.

# File 'lib/vector_mcp/image_util.rb', line 371

def extract_gif_dimensions(data)
  return {} unless data.length > 10

  # GIF dimensions are at bytes 6-9
  width = data[6, 2].unpack1("v")  # Little-endian
  height = data[8, 2].unpack1("v") # Little-endian

  { width: width, height: height }
end
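
To see why the `"v"` (16-bit little-endian) directive is used, it helps to build a header by hand. This is a sketch with a hypothetical helper, `gif_size`, that mirrors the byte offsets described above; it is not the module's method.

```ruby
# Hypothetical helper (not part of VectorMCP): reads the logical screen
# size stored in bytes 6-9 of a GIF header.
def gif_size(data)
  return {} if data.bytesize < 10

  width, height = data.byteslice(6, 4).unpack("v2") # two little-endian uint16 values
  { width: width, height: height }
end

# A hand-built 10-byte header: signature plus width and height.
header = "GIF89a".b + [320, 200].pack("v2")

gif_size(header) # => { width: 320, height: 200 }

# Little-endian stores the low byte first: 320 = 0x0140 becomes 0x40, 0x01.
header.byteslice(6, 2).bytes # => [64, 1]
```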

.extract_jpeg_dimensions(data) ⇒ `Object`

Extracts JPEG dimensions from SOF marker.

# File 'lib/vector_mcp/image_util.rb', line 349

def extract_jpeg_dimensions(data)
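  # Simple JPEG dimension extraction: walk the segments after the SOI marker
  # and look for SOF0 (baseline Start of Frame). Other frame types, such as
  # progressive SOF2 (0xFFC2), are not matched and yield an empty hash.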
  offset = 2
  while offset < data.length - 8
    marker = data[offset, 2].unpack1("n")
    length = data[offset + 2, 2].unpack1("n")

    # SOF0 marker (0xFFC0)
    if marker == 0xFFC0
      height = data[offset + 5, 2].unpack1("n")
      width = data[offset + 7, 2].unpack1("n")
      return { width: width, height: height }
    end

    offset += 2 + length
  end

  {}
end
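
The segment walk can be checked against a hand-built byte stream. The sketch below is a hypothetical variant, `jpeg_size`, that is not the module's method: it follows the same marker/length stepping but, as an assumption for illustration, also accepts the progressive SOF2 marker.

```ruby
# Hypothetical helper (not part of VectorMCP). Accepting SOF2 in addition
# to SOF0 is an assumption made for this sketch.
SOF_MARKERS = [0xFFC0, 0xFFC2].freeze

def jpeg_size(data)
  offset = 2 # skip the SOI marker (0xFFD8)
  while offset + 9 <= data.bytesize
    marker, length = data.byteslice(offset, 4).unpack("nn")
    if SOF_MARKERS.include?(marker)
      # Segment layout: marker(2) length(2) precision(1) height(2) width(2)
      height, width = data.byteslice(offset + 5, 4).unpack("nn")
      return { width: width, height: height }
    end
    offset += 2 + length # the length field counts itself but not the marker
  end
  {}
end

app0 = [0xFFE0, 16].pack("nn") + ("\x00".b * 14)
sof0 = [0xFFC0, 17, 8, 480, 640, 3].pack("nnCnnC") + ("\x00".b * 9)
jpeg = [0xFFD8].pack("n") + app0 + sof0

jpeg_size(jpeg) # => { width: 640, height: 480 }
```

Note that height precedes width in the frame header, which is easy to get backwards.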

.extract_metadata(data) ⇒ `Hash`

Extracts image metadata from binary data.

Examples:

metadata = VectorMCP::ImageUtil.extract_metadata(image_data)
# => { mime_type: "image/jpeg", size: 102400, format: "JPEG" }
# :width and :height are also included when they can be read from the header.

Parameters:

data (String) —

Binary image data.

Returns:

(Hash) —

Metadata hash with available information.

# File 'lib/vector_mcp/image_util.rb', line 267

def extract_metadata(data)
  return {} if data.nil? || data.empty?

  mime_type = detect_image_format(data)
  metadata = {
    size: data.bytesize,
    mime_type: mime_type
  }

  metadata[:format] = mime_type.split("/").last.upcase if mime_type

  # Add basic dimension detection for common formats
  metadata.merge!(extract_dimensions(data, mime_type))
end

.extract_png_dimensions(data) ⇒ `Object`

Extracts PNG dimensions from IHDR chunk.

# File 'lib/vector_mcp/image_util.rb', line 338

def extract_png_dimensions(data)
  return {} unless data.length > 24

  # After the 8-byte signature, the IHDR chunk occupies bytes 8-15 (length and
  # type); its data starts at byte 16: 4-byte width, then 4-byte height, big-endian
  width = data[16, 4].unpack1("N")
  height = data[20, 4].unpack1("N")

  { width: width, height: height }
end
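
The fixed offsets follow from the PNG layout: an 8-byte signature, then the first chunk's 4-byte length and 4-byte type, then the IHDR data. The sketch below uses a hypothetical helper, `png_size`, which is not the module's method; unlike the listing above, it also checks that the first chunk type is `IHDR`.

```ruby
# Hypothetical helper (not part of VectorMCP): reads the size from IHDR
# after confirming the first chunk really is IHDR.
def png_size(data)
  return {} unless data.bytesize >= 24 && data.byteslice(12, 4) == "IHDR"

  width, height = data.byteslice(16, 8).unpack("NN") # two 32-bit big-endian values
  { width: width, height: height }
end

signature = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*") # bytes 0-7
ihdr_head = [13].pack("N") + "IHDR".b                                    # bytes 8-15
png_start = signature + ihdr_head + [800, 600].pack("NN")                # bytes 16-23

png_size(png_start) # => { width: 800, height: 600 }
```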

.file_to_mcp_image_content(file_path, validate: true, max_size: DEFAULT_MAX_SIZE, base_directory: nil) ⇒ `Hash`

Converts file path to MCP-compliant image content.

Examples:

content = VectorMCP::ImageUtil.file_to_mcp_image_content("./avatar.png")

With path traversal protection

content = VectorMCP::ImageUtil.file_to_mcp_image_content(
  user_input_path,
  base_directory: "/app/uploads"
)

Parameters:

file_path (String) —

Path to the image file.
validate (Boolean) (defaults to: true) —

Whether to validate the image.
max_size (Integer) (defaults to: DEFAULT_MAX_SIZE) —

Maximum allowed size for validation.
base_directory (String, nil) (defaults to: nil) —

Optional base directory for path traversal protection. When provided, the resolved file_path must reside within this directory.

Returns:

(Hash) —

MCP image content hash.

Raises:

(ArgumentError) —

If file doesn’t exist, validation fails, or path traversal is detected.

# File 'lib/vector_mcp/image_util.rb', line 233

def file_to_mcp_image_content(file_path, validate: true, max_size: DEFAULT_MAX_SIZE, base_directory: nil)
  validate_path_safety!(file_path, base_directory) if base_directory

  raise ArgumentError, "Image file not found: #{file_path}" unless File.exist?(file_path)

  raise ArgumentError, "Image file not readable: #{file_path}" unless File.readable?(file_path)

  binary_data = File.binread(file_path)
  to_mcp_image_content(binary_data, validate: validate, max_size: max_size)
end
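
The method reads the file with `File.binread`, which returns the raw bytes as a binary string with no newline or encoding conversion, so magic-byte checks see the file as stored. A standalone sketch using a temporary file; the helper name `round_trip_through_file` and the constant `PNG_HEADER` are illustrative only.

```ruby
require "tempfile"

PNG_HEADER = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*")

# Writes the bytes to a temporary file and reads them back in binary mode.
def round_trip_through_file(bytes)
  Tempfile.create(["sample", ".png"]) do |file|
    file.binmode
    file.write(bytes)
    file.flush
    File.binread(file.path)
  end
end

read_back = round_trip_through_file(PNG_HEADER)
read_back == PNG_HEADER                # => true
read_back.encoding == Encoding::BINARY # => true
```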

.process_image_data(data) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Processes input data to extract both binary and base64 representations.

# File 'lib/vector_mcp/image_util.rb', line 173

def process_image_data(data)
  is_base64 = base64_string?(data)

  if is_base64
    # Decode to validate and detect format
    begin
      binary_data = decode_base64(data)
      base64_data = data
    rescue ArgumentError => e
      raise ArgumentError, "Invalid base64 image data: #{e.message}"
    end
  else
    # Assume binary data (dup to avoid modifying frozen strings)
    binary_data = data.dup.force_encoding(Encoding::ASCII_8BIT)
    base64_data = encode_base64(binary_data)
  end

  [binary_data, base64_data]
end

.to_mcp_image_content(data, mime_type: nil, validate: true, max_size: DEFAULT_MAX_SIZE) ⇒ `Hash`

Converts image data to MCP-compliant image content format.

Examples:

Convert binary image data

content = VectorMCP::ImageUtil.to_mcp_image_content(
  File.binread("image.jpg")
)
# => { type: "image", data: "base64...", mimeType: "image/jpeg" }

Convert base64 string with explicit MIME type

content = VectorMCP::ImageUtil.to_mcp_image_content(
  base64_string,
  mime_type: "image/png",
  validate: false
)

Parameters:

data (String) —

Binary image data or base64 encoded string.
mime_type (String, nil) (defaults to: nil) —

MIME type (auto-detected if nil).
validate (Boolean) (defaults to: true) —

Whether to validate the image data.
max_size (Integer) (defaults to: DEFAULT_MAX_SIZE) —

Maximum allowed size for validation.

Returns:

(Hash) —

MCP image content hash with :type, :data, and :mimeType.

Raises:

(ArgumentError) —

If validation fails, the base64 data cannot be decoded, or no MIME type was given and none could be detected.

# File 'lib/vector_mcp/image_util.rb', line 159

def to_mcp_image_content(data, mime_type: nil, validate: true, max_size: DEFAULT_MAX_SIZE)
  binary_data, base64_data = process_image_data(data)
  detected_mime_type = validate_and_detect_format(binary_data, validate, max_size)
  final_mime_type = determine_final_mime_type(mime_type, detected_mime_type)

  {
    type: "image",
    data: base64_data,
    mimeType: final_mime_type
  }
end
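
The returned hash is meant to be serialized as-is. The sketch below rebuilds the same shape by hand, assuming a hypothetical helper `build_image_content` (not part of the module) and using the PNG signature as stand-in image bytes, to show that the payload survives a JSON round trip.

```ruby
require "base64"
require "json"

# Hypothetical helper (not part of VectorMCP): builds a hash with the
# same three keys that to_mcp_image_content returns.
def build_image_content(bytes, mime_type)
  { type: "image", data: Base64.strict_encode64(bytes), mimeType: mime_type }
end

bytes = "\x89PNG\r\n\x1A\n".b # stand-in for real image data
json = JSON.generate(build_image_content(bytes, "image/png"))

# The receiver recovers the original bytes from the "data" field.
parsed = JSON.parse(json)
parsed["mimeType"]                                 # => "image/png"
Base64.strict_decode64(parsed["data"]) == bytes    # => true
```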

.validate_and_detect_format(binary_data, validate, max_size) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Validates image data and detects MIME type if validation is enabled.

# File 'lib/vector_mcp/image_util.rb', line 195

def validate_and_detect_format(binary_data, validate, max_size)
  if validate
    validation = validate_image(binary_data, max_size: max_size)
    raise ArgumentError, "Image validation failed: #{validation[:errors].join(", ")}" unless validation[:valid]

    validation[:mime_type]
  else
    detect_image_format(binary_data)
  end
end

.validate_image(data, max_size: DEFAULT_MAX_SIZE, allowed_formats: nil) ⇒ `Hash`

Validates if the provided data is a valid image.

Examples:

result = VectorMCP::ImageUtil.validate_image(image_data)
if result[:valid]
  puts "Valid #{result[:mime_type]} image"
else
  puts "Errors: #{result[:errors].join(', ')}"
end

Parameters:

data (String) —

The binary image data.
max_size (Integer) (defaults to: DEFAULT_MAX_SIZE) —

Maximum allowed size in bytes.
allowed_formats (Array<String>, nil) (defaults to: nil) —

Allowed MIME types. When nil, every recognized format is accepted.

Returns:

(Hash) —

Validation result with :valid, :mime_type, and :errors keys, plus :size when a format was detected.

# File 'lib/vector_mcp/image_util.rb', line 82

def validate_image(data, max_size: DEFAULT_MAX_SIZE, allowed_formats: nil)
  errors = []

  if data.nil? || data.empty?
    errors << "Image data is empty"
    return { valid: false, mime_type: nil, errors: errors }
  end

  # Check file size
  errors << "Image size (#{data.bytesize} bytes) exceeds maximum allowed size (#{max_size} bytes)" if data.bytesize > max_size

  # Detect format
  mime_type = detect_image_format(data)
  if mime_type.nil?
    errors << "Unrecognized or invalid image format"
    return { valid: false, mime_type: nil, errors: errors }
  end

  # Check allowed formats
  if allowed_formats && !allowed_formats.include?(mime_type)
    errors << "Image format #{mime_type} is not allowed. Allowed formats: #{allowed_formats.join(", ")}"
  end

  {
    valid: errors.empty?,
    mime_type: mime_type,
    size: data.bytesize,
    errors: errors
  }
end

.validate_path_safety!(file_path, base_directory) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Validates that a file path does not escape the given base directory.

Parameters:

file_path (String) —

The file path to validate.
base_directory (String) —

The base directory boundary.

Raises:

(ArgumentError) —

If the resolved path is outside base_directory.

# File 'lib/vector_mcp/image_util.rb', line 250

def validate_path_safety!(file_path, base_directory)
  resolved_base = File.expand_path(base_directory)
  resolved_path = File.expand_path(file_path, resolved_base)

  return if resolved_path.start_with?("#{resolved_base}/") || resolved_path == resolved_base

  raise ArgumentError, "Path traversal detected: resolved path is outside the allowed base directory"
end
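
The trailing slash in the prefix check is what stops a sibling directory such as `/app/uploads-old` from passing as if it were inside `/app/uploads`. The sketch below is a hypothetical variant, `resolve_within`, that is not the module's method: it performs the same containment check but returns the resolved path, so a caller could open exactly the path that was checked.

```ruby
# Hypothetical variant (not part of VectorMCP): same containment check,
# but the resolved absolute path is returned to the caller.
def resolve_within(file_path, base_directory)
  base = File.expand_path(base_directory)
  path = File.expand_path(file_path, base)
  return path if path == base || path.start_with?("#{base}/")

  raise ArgumentError, "Path traversal detected: #{file_path.inspect}"
end

# A relative path is resolved against the base directory.
puts resolve_within("avatars/me.png", "/app/uploads")

# Parent traversal and a sibling directory with a shared prefix are both rejected.
["../../etc/passwd", "/app/uploads-old/x.png"].each do |candidate|
  resolve_within(candidate, "/app/uploads")
rescue ArgumentError => e
  puts e.message
end
```

`File.expand_path` is purely lexical and does not follow symlinks, so a symlink inside the base directory that points elsewhere would still pass this check; `File.realpath` may be worth considering where that matters.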
