Module: VectorMCP::ImageUtil

Defined in:
lib/vector_mcp/image_util.rb

Overview

Image handling utilities for VectorMCP: format detection from magic bytes, validation, base64 encoding and decoding, and conversion to the MCP-compliant image content format.

Constant Summary

IMAGE_SIGNATURES =

Common image MIME types and their magic byte signatures

{
  "image/jpeg" => [
    [0xFF, 0xD8, 0xFF].pack("C*"),
    [0xFF, 0xD8].pack("C*")
  ],
  "image/png" => [[0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*")],
  "image/gif" => %w[
    GIF87a
    GIF89a
  ],
  "image/webp" => [
    "WEBP"
  ],
  "image/bmp" => [
    "BM"
  ],
  "image/tiff" => [
    "II*\0",
    "MM\0*"
  ]
}.freeze
DEFAULT_MAX_SIZE =

Maximum image size in bytes (default: 10 MiB, i.e. 10,485,760 bytes)

10 * 1024 * 1024

Class Method Details

.base64_string?(string) ⇒ Boolean

Checks if a string appears to be base64 encoded.

Parameters:

  • string (String)

    The string to check.

Returns:

  • (Boolean)

    True if the string appears to be base64.



# File 'lib/vector_mcp/image_util.rb', line 286

def base64_string?(string)
  return false if string.nil? || string.empty?

  # Base64 strings should only contain valid base64 characters
  # and be properly padded with correct length
  return false unless string.match?(%r{\A[A-Za-z0-9+/]*={0,2}\z})

  # Allow both padded and unpadded base64, but require proper structure
  # For unpadded base64, length should be at least 4 and not result in invalid decoding
  if string.include?("=")
    # Padded base64 must be multiple of 4
    (string.length % 4).zero?
  else
    # Unpadded base64 - try to decode to see if it's valid
    return false if string.length < 4

    begin
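      # Add padding and try to decode. The outer modulo is parenthesized so
      # that a length already divisible by 4 gets no extra padding.
      padded = string + ("=" * ((4 - (string.length % 4)) % 4))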
      Base64.strict_decode64(padded)
      true
    rescue ArgumentError
      false
    end
  end
end
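
The padding arithmetic used for unpadded input can be checked in isolation. The sketch below is illustrative only: `pad_base64` is a hypothetical helper, not part of VectorMCP::ImageUtil. In Ruby, `*` and `%` share one precedence level and associate left to right, so the modulo that reduces the pad count needs its own parentheses to apply to the count rather than to the repeated string.

```ruby
require "base64"

# Hypothetical helper (not part of VectorMCP::ImageUtil): restores the "="
# padding that an unpadded base64 string is missing.
def pad_base64(string)
  missing = (4 - (string.length % 4)) % 4 # 0 when the length is already a multiple of 4
  string + ("=" * missing)
end

pad_base64("QUI")  # => "QUI="  (decodes to "AB")
pad_base64("QUJD") # => "QUJD"  (decodes to "ABC", no padding needed)
```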

.decode_base64(base64_string) ⇒ String

Decodes base64 string to binary image data.

Examples:

data = VectorMCP::ImageUtil.decode_base64(encoded_string)

Parameters:

  • base64_string (String)

    Base64 encoded image data.

Returns:

  • (String)

    Binary image data.

Raises:

  • (ArgumentError)

    If base64 string is invalid.



# File 'lib/vector_mcp/image_util.rb', line 132

def decode_base64(base64_string)
  Base64.strict_decode64(base64_string)
rescue ArgumentError => e
  raise ArgumentError, "Invalid base64 encoding: #{e.message}"
end
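
Because decode_base64 relies on Base64.strict_decode64, input that contains line breaks (as produced by Base64.encode64) is rejected with an ArgumentError. The sketch below uses only the Ruby standard library to show the difference; the payload is made-up sample data.

```ruby
require "base64"

payload = "x" * 60                 # made-up sample bytes
wrapped = Base64.encode64(payload) # inserts "\n" after every 60 encoded characters

lenient = Base64.decode64(wrapped) # the lenient decoder ignores the line breaks

strict_error =
  begin
    Base64.strict_decode64(wrapped) # the strict decoder rejects them
    nil
  rescue ArgumentError => e
    e
  end

# Removing the line breaks first makes the input acceptable to the strict decoder.
cleaned = Base64.strict_decode64(wrapped.delete("\n"))
```

If callers may pass line-wrapped base64, stripping whitespace before calling decode_base64 is one option.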

.detect_image_format(data) ⇒ String?

Detects the MIME type of image data based on magic bytes.

Examples:

VectorMCP::ImageUtil.detect_image_format(File.binread("image.jpg"))
# => "image/jpeg"

Parameters:

  • data (String)

    The binary image data.

Returns:

  • (String, nil)

    The detected MIME type, or nil if not recognized.



# File 'lib/vector_mcp/image_util.rb', line 47

def detect_image_format(data)
  return nil if data.nil? || data.empty?

  # Ensure we have binary data (dup to avoid modifying frozen strings)
  binary_data = data.dup.force_encoding(Encoding::ASCII_8BIT)

  IMAGE_SIGNATURES.each do |mime_type, signatures|
    signatures.each do |signature|
      case mime_type
      when "image/webp"
        # WebP files start with RIFF then have WEBP at offset 8
        return mime_type if binary_data.start_with?("RIFF") && binary_data[8, 4] == signature
      else
        return mime_type if binary_data.start_with?(signature)
      end
    end
  end

  nil
end
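
The WebP branch needs the extra offset check because RIFF is a generic container that is also used by non-image formats such as WAV. The following is a minimal sketch of the same prefix test on synthetic headers; `sniff` and the two-entry signature subset are hypothetical and only stand in for the full table.

```ruby
# Illustrative subset of the signature table; `sniff` is a hypothetical name.
PNG_MAGIC = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*").freeze

def sniff(data)
  bytes = data.dup.force_encoding(Encoding::ASCII_8BIT)
  return "image/png" if bytes.start_with?(PNG_MAGIC)
  # WebP: "RIFF", a 4-byte little-endian size, then "WEBP" at offset 8.
  return "image/webp" if bytes.start_with?("RIFF") && bytes[8, 4] == "WEBP"

  nil
end

webp_header = "RIFF".b + [0].pack("V") + "WEBP".b
wav_header  = "RIFF".b + [0].pack("V") + "WAVE".b # RIFF container, but not an image

sniff(PNG_MAGIC + "rest".b) # => "image/png"
sniff(webp_header)          # => "image/webp"
sniff(wav_header)           # => nil
```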

.determine_final_mime_type(explicit_mime_type, detected_mime_type) ⇒ Object

This method is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.

Determines the final MIME type to use, preferring explicit over detected.

Raises:

  • (ArgumentError)


# File 'lib/vector_mcp/image_util.rb', line 208

def determine_final_mime_type(explicit_mime_type, detected_mime_type)
  final_mime_type = explicit_mime_type || detected_mime_type
  raise ArgumentError, "Could not determine image MIME type" if final_mime_type.nil?

  final_mime_type
end

.encode_base64(data) ⇒ String

Encodes binary image data to base64 string.

Examples:

encoded = VectorMCP::ImageUtil.encode_base64(File.binread("image.jpg"))

Parameters:

  • data (String)

    The binary image data.

Returns:

  • (String)

    Base64 encoded string.



# File 'lib/vector_mcp/image_util.rb', line 120

def encode_base64(data)
  Base64.strict_encode64(data)
end

.extract_dimensions(data, mime_type) ⇒ Hash

Extracts basic image dimensions for common formats. This is a simplified implementation; for production use, consider using a proper image library like MiniMagick or ImageMagick.

Parameters:

  • data (String)

    Binary image data.

  • mime_type (String)

    Detected MIME type.

Returns:

  • (Hash)

    Hash containing width/height if detectable.



# File 'lib/vector_mcp/image_util.rb', line 320

def extract_dimensions(data, mime_type)
  case mime_type
  when "image/png"
    extract_png_dimensions(data)
  when "image/jpeg"
    extract_jpeg_dimensions(data)
  when "image/gif"
    extract_gif_dimensions(data)
  else
    {}
  end
rescue StandardError
  {} # Return empty hash if dimension extraction fails
end

.extract_gif_dimensions(data) ⇒ Object

Extracts GIF dimensions from header.



# File 'lib/vector_mcp/image_util.rb', line 371

def extract_gif_dimensions(data)
  return {} unless data.length > 10

  # GIF dimensions are at bytes 6-9
  width = data[6, 2].unpack1("v")  # Little-endian
  height = data[8, 2].unpack1("v") # Little-endian

  { width: width, height: height }
end
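
As a quick check of the byte layout, the sketch below builds a synthetic GIF header (signature plus logical screen descriptor) and reads the two little-endian 16-bit fields back. The dimensions 320x200 are made up for illustration.

```ruby
# Synthetic 13-byte GIF header: 6-byte signature, width, height, then the
# packed-fields, background-color, and aspect-ratio bytes (all zero here).
header = "GIF89a".b + [320, 200].pack("v2") + [0, 0, 0].pack("C3")

# "v" reads an unsigned 16-bit little-endian integer.
width, height = header[6, 4].unpack("v2")
# width  => 320
# height => 200
```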

.extract_jpeg_dimensions(data) ⇒ Object

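Extracts JPEG dimensions from the SOF0 (baseline) marker. Only marker 0xFFC0 is recognized, so progressive JPEGs, which use SOF2 (0xFFC2), yield an empty hash.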



# File 'lib/vector_mcp/image_util.rb', line 349

def extract_jpeg_dimensions(data)
  # Simple JPEG dimension extraction
  # Look for SOF0 (Start of Frame) marker
  offset = 2
  while offset < data.length - 8
    marker = data[offset, 2].unpack1("n")
    length = data[offset + 2, 2].unpack1("n")

    # SOF0 marker (0xFFC0)
    if marker == 0xFFC0
      height = data[offset + 5, 2].unpack1("n")
      width = data[offset + 7, 2].unpack1("n")
      return { width: width, height: height }
    end

    offset += 2 + length
  end

  {}
end
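
The segment walk can be exercised on a synthetic byte string. The sketch below is a standalone re-creation under stated assumptions, not the library method: `sof0_dimensions` is a hypothetical name, and the "JPEG" contains only an SOI marker, a zero-filled APP0 segment, and a zero-filled SOF0 segment.

```ruby
# Hypothetical standalone version of the SOF0 scan, for illustration only.
def sof0_dimensions(bytes)
  offset = 2 # skip the SOI marker (0xFFD8)
  while offset < bytes.bytesize - 8
    # Each segment is a 2-byte marker followed by a 2-byte big-endian length
    # that counts itself but not the marker.
    marker, length = bytes.byteslice(offset, 4).unpack("nn")
    if marker == 0xFFC0
      # SOF0 payload: 1 byte precision, then height, then width.
      height, width = bytes.byteslice(offset + 5, 4).unpack("nn")
      return { width: width, height: height }
    end
    offset += 2 + length
  end
  {}
end

soi  = [0xFF, 0xD8].pack("C*")
app0 = [0xFFE0, 16].pack("nn") + ("\x00".b * 14)                   # 14 payload bytes
sof0 = [0xFFC0, 17, 8, 200, 320, 3].pack("nnCnnC") + ("\x00".b * 9) # 3 components

sof0_dimensions(soi + app0 + sof0) # => { width: 320, height: 200 }
sof0_dimensions(soi + app0)        # => {}
```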

.extract_metadata(data) ⇒ Hash

Extracts image metadata from binary data.

Examples:

metadata = VectorMCP::ImageUtil.extract_metadata(image_data)
# => { mime_type: "image/jpeg", size: 102400, format: "JPEG" }

Parameters:

  • data (String)

    Binary image data.

Returns:

  • (Hash)

    Metadata hash with available information.



# File 'lib/vector_mcp/image_util.rb', line 267

def extract_metadata(data)
  return {} if data.nil? || data.empty?

  mime_type = detect_image_format(data)
  metadata = {
    size: data.bytesize,
    mime_type: mime_type
  }

  metadata[:format] = mime_type.split("/").last.upcase if mime_type

  # Add basic dimension detection for common formats
  metadata.merge!(extract_dimensions(data, mime_type))
end

.extract_png_dimensions(data) ⇒ Object

Extracts PNG dimensions from IHDR chunk.



# File 'lib/vector_mcp/image_util.rb', line 338

def extract_png_dimensions(data)
  return {} unless data.length > 24

  # IHDR data starts at byte 16 (8-byte signature + 4-byte chunk length +
  # 4-byte chunk type); width and height are 4-byte big-endian integers
  width = data[16, 4].unpack1("N")
  height = data[20, 4].unpack1("N")

  { width: width, height: height }
end
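
To make the offsets concrete, the sketch below assembles the first 29 bytes of a synthetic PNG (signature plus IHDR chunk, CRC omitted) and reads width and height back. The 640x480 size and the remaining IHDR field values are made up for illustration.

```ruby
signature = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*")

# IHDR chunk: 4-byte length (13), 4-byte type, then width, height, and five
# single-byte fields (bit depth, color type, compression, filter, interlace).
ihdr = [13].pack("N") + "IHDR".b + [640, 480].pack("N2") + [8, 6, 0, 0, 0].pack("C5")
png_prefix = signature + ihdr

# "N" reads an unsigned 32-bit big-endian integer.
width, height = png_prefix[16, 8].unpack("N2")
# width  => 640
# height => 480
```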

.file_to_mcp_image_content(file_path, validate: true, max_size: DEFAULT_MAX_SIZE, base_directory: nil) ⇒ Hash

Converts file path to MCP-compliant image content.

Examples:

content = VectorMCP::ImageUtil.file_to_mcp_image_content("./avatar.png")

With path traversal protection

content = VectorMCP::ImageUtil.file_to_mcp_image_content(
  user_input_path,
  base_directory: "/app/uploads"
)

Parameters:

  • file_path (String)

    Path to the image file.

  • validate (Boolean) (defaults to: true)

    Whether to validate the image.

  • max_size (Integer) (defaults to: DEFAULT_MAX_SIZE)

    Maximum allowed size for validation.

  • base_directory (String, nil) (defaults to: nil)

    Optional base directory for path traversal protection. When provided, the resolved file_path must reside within this directory.

Returns:

  • (Hash)

    MCP image content hash.

Raises:

  • (ArgumentError)

    If the file doesn't exist or isn't readable, validation fails, or path traversal is detected.



# File 'lib/vector_mcp/image_util.rb', line 233

def file_to_mcp_image_content(file_path, validate: true, max_size: DEFAULT_MAX_SIZE, base_directory: nil)
  validate_path_safety!(file_path, base_directory) if base_directory

  raise ArgumentError, "Image file not found: #{file_path}" unless File.exist?(file_path)

  raise ArgumentError, "Image file not readable: #{file_path}" unless File.readable?(file_path)

  binary_data = File.binread(file_path)
  to_mcp_image_content(binary_data, validate: validate, max_size: max_size)
end

.process_image_data(data) ⇒ Object

This method is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.

Processes input data to extract both binary and base64 representations.



# File 'lib/vector_mcp/image_util.rb', line 173

def process_image_data(data)
  is_base64 = base64_string?(data)

  if is_base64
    # Decode to validate and detect format
    begin
      binary_data = decode_base64(data)
      base64_data = data
    rescue ArgumentError => e
      raise ArgumentError, "Invalid base64 image data: #{e.message}"
    end
  else
    # Assume binary data (dup to avoid modifying frozen strings)
    binary_data = data.dup.force_encoding(Encoding::ASCII_8BIT)
    base64_data = encode_base64(binary_data)
  end

  [binary_data, base64_data]
end

.to_mcp_image_content(data, mime_type: nil, validate: true, max_size: DEFAULT_MAX_SIZE) ⇒ Hash

Converts image data to MCP-compliant image content format.

Examples:

Convert binary image data

content = VectorMCP::ImageUtil.to_mcp_image_content(
  File.binread("image.jpg")
)
# => { type: "image", data: "base64...", mimeType: "image/jpeg" }

Convert base64 string with explicit MIME type

content = VectorMCP::ImageUtil.to_mcp_image_content(
  base64_string,
  mime_type: "image/png",
  validate: false
)

Parameters:

  • data (String)

    Binary image data or base64 encoded string.

  • mime_type (String, nil) (defaults to: nil)

    MIME type (auto-detected if nil).

  • validate (Boolean) (defaults to: true)

    Whether to validate the image data.

  • max_size (Integer) (defaults to: DEFAULT_MAX_SIZE)

    Maximum allowed size for validation.

Returns:

  • (Hash)

    MCP image content hash with :type, :data, and :mimeType.

Raises:

  • (ArgumentError)

    If validation fails, or if no MIME type is given and none can be detected.



# File 'lib/vector_mcp/image_util.rb', line 159

def to_mcp_image_content(data, mime_type: nil, validate: true, max_size: DEFAULT_MAX_SIZE)
  binary_data, base64_data = process_image_data(data)
  detected_mime_type = validate_and_detect_format(binary_data, validate, max_size)
  final_mime_type = determine_final_mime_type(mime_type, detected_mime_type)

  {
    type: "image",
    data: base64_data,
    mimeType: final_mime_type
  }
end
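
For orientation, the sketch below builds a hash of the same shape by hand with the standard library and serializes it to JSON. It does not call the library, and the 8-byte PNG signature stands in for real image data.

```ruby
require "base64"
require "json"

# Stand-in for real image bytes: just the 8-byte PNG signature.
png_signature = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A].pack("C*")

content = {
  type: "image",
  data: Base64.strict_encode64(png_signature), # => "iVBORw0KGgo="
  mimeType: "image/png"
}

# Symbol keys become plain JSON strings, so the camelCase key is preserved.
json = JSON.generate(content)
# => {"type":"image","data":"iVBORw0KGgo=","mimeType":"image/png"}
```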

.validate_and_detect_format(binary_data, validate, max_size) ⇒ Object

This method is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.

Validates the image data when validation is enabled and returns the detected MIME type; when validation is disabled, only format detection runs.



# File 'lib/vector_mcp/image_util.rb', line 195

def validate_and_detect_format(binary_data, validate, max_size)
  if validate
    validation = validate_image(binary_data, max_size: max_size)
    raise ArgumentError, "Image validation failed: #{validation[:errors].join(", ")}" unless validation[:valid]

    validation[:mime_type]
  else
    detect_image_format(binary_data)
  end
end

.validate_image(data, max_size: DEFAULT_MAX_SIZE, allowed_formats: nil) ⇒ Hash

Checks whether the provided data is a valid image.

Examples:

result = VectorMCP::ImageUtil.validate_image(image_data)
if result[:valid]
  puts "Valid #{result[:mime_type]} image"
else
  puts "Errors: #{result[:errors].join(', ')}"
end

Parameters:

  • data (String)

    The binary image data.

  • max_size (Integer) (defaults to: DEFAULT_MAX_SIZE)

    Maximum allowed size in bytes.

  • allowed_formats (Array<String>, nil) (defaults to: nil)

    Allowed MIME types. When nil, any recognized format is accepted.

Returns:

  • (Hash)

    Validation result with :valid, :mime_type, and :errors keys, plus :size when a format is recognized.



# File 'lib/vector_mcp/image_util.rb', line 82

def validate_image(data, max_size: DEFAULT_MAX_SIZE, allowed_formats: nil)
  errors = []

  if data.nil? || data.empty?
    errors << "Image data is empty"
    return { valid: false, mime_type: nil, errors: errors }
  end

  # Check file size
  errors << "Image size (#{data.bytesize} bytes) exceeds maximum allowed size (#{max_size} bytes)" if data.bytesize > max_size

  # Detect format
  mime_type = detect_image_format(data)
  if mime_type.nil?
    errors << "Unrecognized or invalid image format"
    return { valid: false, mime_type: nil, errors: errors }
  end

  # Check allowed formats
  if allowed_formats && !allowed_formats.include?(mime_type)
    errors << "Image format #{mime_type} is not allowed. Allowed formats: #{allowed_formats.join(", ")}"
  end

  {
    valid: errors.empty?,
    mime_type: mime_type,
    size: data.bytesize,
    errors: errors
  }
end

.validate_path_safety!(file_path, base_directory) ⇒ Object

This method is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.

Validates that a file path does not escape the given base directory.

Parameters:

  • file_path (String)

    The file path to validate.

  • base_directory (String)

    The base directory boundary.

Raises:

  • (ArgumentError)

    If the resolved path is outside base_directory.



# File 'lib/vector_mcp/image_util.rb', line 250

def validate_path_safety!(file_path, base_directory)
  resolved_base = File.expand_path(base_directory)
  resolved_path = File.expand_path(file_path, resolved_base)

  return if resolved_path.start_with?("#{resolved_base}/") || resolved_path == resolved_base

  raise ArgumentError, "Path traversal detected: resolved path is outside the allowed base directory"
end
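
The containment test can be tried without touching the filesystem, since File.expand_path only normalizes the path string. The sketch below assumes POSIX-style paths; `inside_base?` is a hypothetical predicate and the paths are made up.

```ruby
# Hypothetical predicate mirroring the containment check above.
def inside_base?(file_path, base_directory)
  resolved_base = File.expand_path(base_directory)
  # Relative paths are resolved against the base; absolute paths are kept as-is.
  resolved_path = File.expand_path(file_path, resolved_base)
  resolved_path == resolved_base || resolved_path.start_with?("#{resolved_base}/")
end

inside_base?("avatars/me.png", "/app/uploads")          # => true
inside_base?("../secrets.yml", "/app/uploads")          # => false
inside_base?("/app/uploads-old/me.png", "/app/uploads") # => false
```

The trailing "/" in the prefix is what rejects sibling directories such as /app/uploads-old. Because File.expand_path does not resolve symlinks, a symlink inside the base directory that points elsewhere would still pass this check; File.realpath is one way to resolve links first if that matters for a deployment.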