Class: PEROBS::BTree

Inherits:
Object
  • Object
show all
Defined in:
lib/perobs/BTree.rb

Overview

This BTree class is very similar to a classic B+Tree implementation. It manages a tree that is always balanced. The BTree is stored in the specified directory and partially kept in memory to speed up operations. The order of the tree specifies how many keys each node will be able to hold. Leaf nodes will have a value associated with each key. Branch nodes have N + 1 references to child nodes instead.

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(dir, name, order, progressmeter) ⇒ BTree

Create a new BTree object.

Parameters:

  • dir (String)

    Directory to store the tree file

  • name (String)

    Base name of the BTree related files in ‘dir’

  • order (Integer)

    The maximum number of keys per node. This number must be odd and larger than 2 and smaller than 2**16 - 1.

  • progressmeter (ProgressMeter)

    reference to a ProgressMeter object



51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
# File 'lib/perobs/BTree.rb', line 51

def initialize(dir, name, order, progressmeter)
  @dir = dir
  @name = name
  @progressmeter = progressmeter

  unless order > 4
    PEROBS.log.fatal "BTree order must be larger than 4, not #{order}"
  end
  unless order % 2 == 1
    PEROBS.log.fatal "BTree order must be an uneven number, not #{order}"
  end
  unless order < 2 ** 16 - 1
    PEROBS.log.fatal "BTree order must be smaller than #{2**16 - 1}"
  end
  @order = order

  # This EquiBlobsFile contains the nodes of the BTree.
  @nodes = EquiBlobsFile.new(@dir, @name, @progressmeter,
                             BTreeNode::node_bytes(@order))
  @nodes.register_custom_data('first_leaf')
  @nodes.register_custom_data('last_leaf')
  @nodes.register_custom_data('btree_size')
  @node_cache = PersistentObjectCache.new(2**16, -1, BTreeNode, self)
  @root = @first_leaf = @last_leaf = nil
  @size = 0

  # This BTree implementation uses a write cache to improve write
  # performance of multiple successive read/write operations. This also
  # means that data may not be written on the backing store until the
  # sync() or close() methods have been called. A bug in the program or a
  # premature program termination can lead to data loss. To detect such
  # situations, we use a lock file whenever there are pending writes.
  @is_dirty = false
  @dirty_flag = LockFile.new(File.join(@dir, name + '.dirty'),
                             { :timeout_secs => 0 })
end

Instance Attribute Details

#first_leafObject (readonly)

Returns the value of attribute first_leaf.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def first_leaf
  @first_leaf
end

#last_leafObject (readonly)

Returns the value of attribute last_leaf.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def last_leaf
  @last_leaf
end

#node_cacheObject (readonly)

Returns the value of attribute node_cache.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def node_cache
  @node_cache
end

#nodesObject (readonly)

Returns the value of attribute nodes.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def nodes
  @nodes
end

#orderObject (readonly)

Returns the value of attribute order.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def order
  @order
end

#sizeObject (readonly)

Returns the value of attribute size.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def size
  @size
end

Instance Method Details

#check(&block) ⇒ Boolean

Check if the tree file contains any errors.

Returns:

  • (Boolean)

    true if no erros were found, false otherwise



161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
# File 'lib/perobs/BTree.rb', line 161

def check(&block)
  sync
  return false unless @nodes.check

  entries = 0
  stats = nil
  @progressmeter.start('Checking index structure', @size) do |pm|
    stats = @root.check do |k, v|
      pm.update(entries += 1)
      block_given? ? yield(k, v) : true
    end
  end

  return false unless stats

  unless entries == @size
    PEROBS.log.error "The BTree size (#{@size}) and the number of " +
      "found entries (#{entries}) don't match"
    return false
  end
  unless stats.nodes_count == @nodes.total_entries
    PEROBS.log.error "The BTree nodes count (#{stats.nodes_count}) and " +
      "the number of entries in the nodes file (#{@nodes.total_entries}) " +
      "don't match"
      return false
  end
  PEROBS.log.info "Statistics for the BTree #{@name}: " +
    "Number of nodes: #{stats.nodes_count}; " +
    "Branch depth: #{stats.branch_depth}; " +
    "Number of leave nodes: #{stats.leave_nodes}; " +
    "Number of leaves: #{stats.leaves}"

  !stats.nil?
end

#clearObject

Clear all pools and forget any registered spaces.



134
135
136
137
138
139
# File 'lib/perobs/BTree.rb', line 134

def clear
  @node_cache.clear
  @nodes.clear
  @size = 0
  set_root(BTreeNode::create(self))
end

#closeObject

Close the tree file.



119
120
121
122
123
124
125
126
# File 'lib/perobs/BTree.rb', line 119

def close
  sync
  PEROBS.log.info "BTree file #{@name} has currently " +
    "#{@nodes.total_entries} used entries and #{@nodes.total_spaces} " +
    "unused entries"
  @nodes.close
  @root = nil
end

#delete_node(address) ⇒ Object

Delete the node at the given address in the BTree file.

Parameters:

  • address (Integer)

    address in file



271
272
273
274
# File 'lib/perobs/BTree.rb', line 271

def delete_node(address)
  @node_cache.delete(address)
  @nodes.delete_blob(address)
end

#each {|key, value| ... } ⇒ Object

Iterate over all key/value pairs that are stored in the tree.

Yields:

  • (key, value)


265
266
267
# File 'lib/perobs/BTree.rb', line 265

def each(&block)
  @root.each(&block)
end

#entries_countInteger

Returns The number of entries stored in the tree.

Returns:

  • (Integer)

    The number of entries stored in the tree.



277
278
279
# File 'lib/perobs/BTree.rb', line 277

def entries_count
  @size
end

#eraseObject

Erase the backing store of the BTree. This method should only be called when not having the BTree open. And it obviously and permanently erases all stored data from the BTree.



144
145
146
147
148
149
# File 'lib/perobs/BTree.rb', line 144

def erase
  @nodes.erase
  @size = 0
  @root = nil
  @dirty_flag.forced_unlock
end

#get(key) ⇒ Integer or nil

Retrieve the value associated with the given key. If no entry was found, return nil.

Parameters:

  • key (Integer)

    Unique key

Returns:

  • (Integer or nil)

    found value or nil



232
233
234
# File 'lib/perobs/BTree.rb', line 232

def get(key)
  @root.get(key)
end

#get_best_match(key, min_miss_increment) ⇒ Object

Either return the key/value pair that exactly matches the key or a key/value pair that has a key that is at least min_miss_increment larger than the key.



239
240
241
# File 'lib/perobs/BTree.rb', line 239

def get_best_match(key, min_miss_increment)
  @root.get_best_match(key, min_miss_increment)
end

#insert(key, value) ⇒ Object

Insert a new value into the tree using the key as a unique index. If the key already exists the old value will be overwritten.

Parameters:

  • key (Integer)

    Unique key

  • value (Integer)

    value



221
222
223
224
225
226
# File 'lib/perobs/BTree.rb', line 221

def insert(key, value)
  if @root.insert(key, value)
    @size += 1
    @node_cache.flush
  end
end

#is_open?Boolean

Returns true if file is currently open.

Returns:

  • (Boolean)

    true if file is currently open



129
130
131
# File 'lib/perobs/BTree.rb', line 129

def is_open?
  !@root.nil?
end

#open(file_must_exist = false) ⇒ Object

Open the tree file.



89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
# File 'lib/perobs/BTree.rb', line 89

def open(file_must_exist = false)
  if @dirty_flag.is_locked?
    PEROBS.log.fatal "Index file #{@nodes.file_name} is already " +
      "locked"
  end
  if file_must_exist && !@nodes.file_exist?
    PEROBS.log.fatal "Index file #{@nodes.file_name} does not exist"
  end

  @node_cache.clear
  @nodes.open

  if @nodes.total_entries == 0
    # We've created a new nodes file
    node = BTreeNode::create(self)
  else
    # We are loading an existing tree.
    node = BTreeNode::load_and_link(self, @nodes.first_entry)
    @first_leaf = BTreeNode::load_and_link(
      self, @nodes.get_custom_data('first_leaf'))
    @last_leaf = BTreeNode::load_and_link(
      self, @nodes.get_custom_data('last_leaf'))
  end
  set_root(node)

  # Get the total number of entries that are stored in the tree.
  @size = @nodes.get_custom_data('btree_size')
end

#remove(key) ⇒ Object

Find and remove the value associated with the given key. If no entry was found, return nil, otherwise the found value.



245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
# File 'lib/perobs/BTree.rb', line 245

def remove(key)
  @size -= 1 unless (removed_value = @root.remove(key)).nil?

  # Check if the root node only contains one child link after the delete
  # operation. Then we can delete that node and pull the tree one level
  # up. This could happen for a sequence of nodes that all got merged to
  # single child nodes.
  while !@root.is_leaf && @root.children.size == 1
    old_root = @root
    set_root(@root.children.first)
    @root.parent = nil
    delete_node(old_root.node_address)
  end

  @node_cache.flush
  removed_value
end

#set_first_leaf(node) ⇒ Object

Set the address of the first leaf node.

Parameters:



205
206
207
208
# File 'lib/perobs/BTree.rb', line 205

def set_first_leaf(node)
  @first_leaf = node
  @nodes.set_custom_data('first_leaf', node.node_address)
end

#set_last_leaf(node) ⇒ Object

Set the address of the last leaf node.

Parameters:



212
213
214
215
# File 'lib/perobs/BTree.rb', line 212

def set_last_leaf(node)
  @last_leaf = node
  @nodes.set_custom_data('last_leaf', node.node_address)
end

#set_root(node) ⇒ Object

Register a new node as root node of the tree.

Parameters:



198
199
200
201
# File 'lib/perobs/BTree.rb', line 198

def set_root(node)
  @root = node
  @nodes.first_entry = node.node_address
end

#syncObject

Flush all pending modifications into the tree file.



152
153
154
155
156
157
# File 'lib/perobs/BTree.rb', line 152

def sync
  @node_cache.flush(true)
  @nodes.set_custom_data('btree_size', @size)
  @nodes.sync
  @dirty_flag.unlock if @dirty_flag.is_locked?
end

#to_sString

Returns Human reable form of the tree.

Returns:

  • (String)

    Human reable form of the tree.



282
283
284
# File 'lib/perobs/BTree.rb', line 282

def to_s
  @root.to_s
end