Class: TaskJuggler::TextParser::Pattern

Inherits:
Object
Defined in:
lib/taskjuggler/TextParser/Pattern.rb

Overview

This class models the most crucial element of a syntax description: the pattern. A TextParserPattern primarily consists of a set of tokens. Tokens are Strings whose first character determines the type of the token. There are 4 known types.

Terminal token: In the syntax declaration the terminal token is prefixed by an underscore. Terminal tokens are terminal symbols of the syntax tree. They just represent themselves.

Variable token: The variable token describes values of a certain class such as strings or numbers. In the syntax declaration the token is prefixed by a dollar sign and the text of the token specifies the variable type. See ProjectFileParser for a complete list of variable types.

Reference token: The reference token specifies a reference to another parser rule. In the syntax declaration the token is prefixed by a bang and the text matches the name of the rule. See TextParserRule for details.

End token: The . token marks the expected end of the input stream.

In addition to the pure syntax tree information the pattern also holds documentary information about the pattern.
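To illustrate the four prefix characters, here is a minimal, self-contained sketch of the classification that the constructor performs on each token. The `classify` helper is hypothetical; the real constructor stores the resulting pairs in `@tokens`:

```ruby
# Map the prefix character to the token type, mirroring Pattern#initialize.
TOKEN_TYPES = [ :reference, :variable, :literal, :eof ]

def classify(token)
  idx = '!$_.'.index(token[0])
  raise "Tokens must start with a type identifier [!$_.]: #{token}" unless idx

  type = TOKEN_TYPES[idx]
  # Literals keep their content as a String; the end token gets a marker
  # name; references and variables are stored as Symbols.
  name = if type == :literal
           token[1..-1]
         elsif type == :eof
           '<END>'
         else
           token[1..-1].intern
         end
  [ type, name ]
end

classify('_task')    # => [:literal, "task"]
classify('$STRING')  # => [:variable, :STRING]
classify('!project') # => [:reference, :project]
classify('.')        # => [:eof, "<END>"]
```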

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(tokens, function = nil) ⇒ Pattern

Create a new Pattern object. tokens must be an Array of String objects that describe the pattern. function can be a reference to a method that is called when the pattern has been recognized by the parser.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 49

def initialize(tokens, function = nil)
  # A unique name for the pattern that is used in the documentation.
  @keyword = nil
  # Initialize pattern doc as empty.
  @doc = nil
  # A list of TokenDoc elements that describe the meaning of variable
  # tokens. The order of the tokens and entries in the Array must correlate.
  @args = []
  # The syntax can evolve over time. The support level specifies which
  # level of support this pattern has. Possible values are :experimental,
  # :beta, :supported, :deprecated and :removed.
  @supportLevel = :supported
  # A list of references to other patterns that are related to this pattern.
  @seeAlso = []
  # A reference to a file under test/TestSuite/Syntax/Correct and a tag
  # within that file. This identifies example TJP code to be included with
  # the reference manual.
  @exampleFile = nil
  @exampleTag = nil

  @tokens = []
  tokens.each do |token|
    unless '!$_.'.include?(token[0])
      raise "Fatal Error: All pattern tokens must start with a type " +
            "identifier [!$_.]: #{tokens.join(', ')}"
    end
    # For the syntax specification using a prefix character is more
    # convenient. But for further processing, we need to split the string
    # into two symbols. The prefix determines the token type, the rest is
    # the token name. There are 4 types of tokens:
    # :reference : a reference to another rule
    # :variable : a terminal symbol
    # :literal : a user defined string
    # :eof : marks the end of an input stream
    type = [ :reference, :variable, :literal, :eof ]['!$_.'.index(token[0])]
    # For literals we use a String to store the token content. For others,
    # a symbol is better suited.
    name = type == :literal ?
           token[1..-1] : (type == :eof ? '<END>' : token[1..-1].intern)
    # We favor an Array to store the 2 elements over a Hash for
    # performance reasons.
    @tokens << [ type, name ]
    # Initialize pattern argument descriptions as empty.
    @args << nil
  end
  @function = function
  # In some cases we don't want to show all tokens in the syntax
  # documentation. This value specifies the index of the last shown token.
  @lastSyntaxToken = @tokens.length - 1

  @transitions = []
end

Instance Attribute Details

#doc ⇒ Object (readonly)

Returns the value of attribute doc.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def doc
  @doc
end

#exampleFile ⇒ Object (readonly)

Returns the value of attribute exampleFile.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def exampleFile
  @exampleFile
end

#exampleTag ⇒ Object (readonly)

Returns the value of attribute exampleTag.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def exampleTag
  @exampleTag
end

#function ⇒ Object (readonly)

Returns the value of attribute function.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def function
  @function
end

#keyword ⇒ Object (readonly)

Returns the value of attribute keyword.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def keyword
  @keyword
end

#seeAlso ⇒ Object (readonly)

Returns the value of attribute seeAlso.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def seeAlso
  @seeAlso
end

#supportLevel ⇒ Object (readonly)

Returns the value of attribute supportLevel.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def supportLevel
  @supportLevel
end

#tokens ⇒ Object (readonly)

Returns the value of attribute tokens.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 43

def tokens
  @tokens
end

Instance Method Details

#[](i) ⇒ Object

Convenience function to access individual tokens by index.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 246

def [](i)
  @tokens[i]
end

#addTransitionsToState(states, rules, stateStack, sourceState, destRule, destIndex, loopBack) ⇒ Object

Add the transitions to the State objects of this pattern. states is a Hash with all State objects. rules is a Hash with the Rule objects of the syntax. stateStack is an Array of State objects that have been traversed before reaching this pattern. sourceState is the State that the transition originates from. destRule, this pattern and destIndex describe the State that the transition leads to. loopBack is a boolean flag that is set to true when the transition describes a loop back to the start of the Rule.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 141

def addTransitionsToState(states, rules, stateStack, sourceState,
                          destRule, destIndex, loopBack)
  # If we hit a token in the pattern that is optional, we need to consider
  # the next token of the pattern as well.
  loop do
    if destIndex >= @tokens.length
      if sourceState.rule == destRule
        if destRule.repeatable
          # The transition leads us back to the start of the Rule. This
          # will generate transitions to the first token of all patterns
          # of this Rule.
          destRule.addTransitionsToState(states, rules, [], sourceState,
                                         true)
        end
      end
      # We've reached the end of the pattern. No more transitions to
      # consider.
      return
    end

    # The token descriptor tells us where the transition(s) need to go to.
    tokenType, tokenName = @tokens[destIndex]

    case tokenType
    when :reference
      # The descriptor references another rule.
      unless (refRule = rules[tokenName])
        raise "Unknown rule #{tokenName} referenced in rule #{destRule.name}"
      end
      # If we reference another rule from a pattern, we need to come back
      # to the pattern once we are done with the referenced rule. To be
      # able to come back, we collect a list of all the States that we
      # have passed during a reference resolution. This list forms a stack
      # that is popped during reduce operations of the parser FSM.
      skippedState = states[[ destRule, self, destIndex ]]
      # Rules may reference themselves directly or indirectly. To avoid
      # endless recursions of this algorithm, we stop once we have
      # detected a recursion. We have already all necessary transitions
      # collected. The recursion will be unrolled in the parser FSM.
      unless stateStack.include?(skippedState)
        # Push the skipped state on the stateStack before recursing.
        stateStack.push(skippedState)
        refRule.addTransitionsToState(states, rules, stateStack,
                                      sourceState, loopBack)
        # Once we're done, remove the State from the stateStack again.
        stateStack.pop
      end

      # If the referenced rule is not optional, we have no further
      # transitions for this pattern at this destIndex.
      break unless refRule.optional?(rules)
    else
      unless (destState = states[[ destRule, self, destIndex ]])
        raise "Destination state not found"
      end
      # We've found a transition to a terminal token. Add the transition
      # to the source State.
      sourceState.addTransition(@tokens[destIndex], destState, stateStack,
                                loopBack)
      # Fixed tokens are never optional. There are no more transitions for
      # this pattern at this index.
      break
    end

    destIndex += 1
  end
end

#each ⇒ Object

Iterator for tokens.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 251

def each
  @tokens.each { |type, name| yield(type, name) }
end

#empty? ⇒ Boolean

Returns true if the pattern is empty.

Returns:

  • (Boolean)


# File 'lib/taskjuggler/TextParser/Pattern.rb', line 256

def empty?
  @tokens.empty?
end

#generateStates(rule, rules) ⇒ Object

Generate the state machine states for the pattern. rule is the Rule that the pattern belongs to. A list of generated State objects will be returned.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 105

def generateStates(rule, rules)
  # The last token of a pattern must always trigger a reduce operation.
  # But if the last tokens of a pattern describe fully optional syntax,
  # the last non-optional token and all following optional tokens must
  # trigger a reduce operation. Here we find the index of the first token
  # that must trigger a reduce operation.
  firstReduceableToken = @tokens.length - 1
  (@tokens.length - 2).downto(0).each do |i|
    if optionalToken(i + 1, rules)
      # If token i + 1 is optional, assume token i is the first one to
      # trigger a reduce.
      firstReduceableToken = i
    else
      # token i + 1 is not optional, we found the first token to trigger
      # the reduce.
      break
    end
  end

  states = []
  @tokens.length.times do |i|
    states << (state = State.new(rule, self, i))
    # Mark all states that are allowed to trigger a reduce operation.
    state.noReduce = false if i >= firstReduceableToken
  end
  states
end
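The backward scan over trailing optional tokens can be isolated into a small sketch. The `optional` Array of booleans is a hypothetical stand-in for the `optionalToken(i, rules)` lookup used above:

```ruby
# Walk backwards from the last token: as long as the token to the right
# is optional, the current token must also be able to trigger a reduce.
def first_reduceable_token(optional)
  first = optional.length - 1
  (optional.length - 2).downto(0) do |i|
    break unless optional[i + 1]
    first = i
  end
  first
end

# With two trailing optional tokens, the reduce range starts at index 1.
first_reduceable_token([ false, false, true, true ])  # => 1
```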

#length ⇒ Object

Returns the number of tokens in the pattern.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 261

def length
  @tokens.length
end

#optional?(rules) ⇒ Boolean

Return true if all tokens of the pattern are optional. If a token references a rule, this rule is followed for the check.

Returns:

  • (Boolean)


# File 'lib/taskjuggler/TextParser/Pattern.rb', line 267

def optional?(rules)
  @tokens.each do |type, name|
    if type == :literal || type == :variable
      return false
    elsif type == :reference
      if !rules[name].optional?(rules)
        return false
      end
    end
  end
  true
end
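The same check can be exercised over a toy grammar representation. Here a rules Hash maps rule names to Arrays of patterns, each pattern an Array of [type, name] tokens; this and the `rule_optional?` helper are simplified, hypothetical stand-ins for the real Rule objects (recursive rules would additionally need a visited set):

```ruby
# A pattern is optional only if every token is optional; literals and
# variables are never optional, references follow the referenced rule.
def pattern_optional?(tokens, rules)
  tokens.all? do |type, name|
    case type
    when :literal, :variable then false
    when :reference then rule_optional?(rules[name], rules)
    else true # :eof does not make the pattern mandatory
    end
  end
end

# A rule is optional if at least one of its patterns is optional.
def rule_optional?(patterns, rules)
  patterns.any? { |p| pattern_optional?(p, rules) }
end

rules = { :flags => [ [] ] } # rule with an empty (hence optional) pattern
pattern_optional?([ [ :reference, :flags ] ], rules) # => true
pattern_optional?([ [ :literal, 'task' ] ], rules)   # => false
```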

#setArg(idx, doc) ⇒ Object

Set the documentation text for the idx-th variable.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 216

def setArg(idx, doc)
  @args[idx] = doc
end

#setDoc(keyword, doc) ⇒ Object

Set the keyword and documentation text for the pattern.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 210

def setDoc(keyword, doc)
  @keyword = keyword
  @doc = doc
end

#setExample(file, tag) ⇒ Object

Set the file and tag for the TJP code example.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 240

def setExample(file, tag)
  @exampleFile = file
  @exampleTag = tag
end

#setLastSyntaxToken(idx) ⇒ Object

Restrict the syntax documentation to the first idx tokens.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 221

def setLastSyntaxToken(idx)
  @lastSyntaxToken = idx
end

#setSeeAlso(also) ⇒ Object

Set the references to related patterns.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 235

def setSeeAlso(also)
  @seeAlso = also
end

#setSupportLevel(level) ⇒ Object

Specify the support level of this pattern.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 226

def setSupportLevel(level)
  unless [ :experimental, :beta, :supported, :deprecated,
           :removed ].include?(level)
    raise "Fatal Error: Unknown support level #{level}"
  end
  @supportLevel = level
end

#terminalSymbol?(i) ⇒ Boolean

Returns true if the i-th token is a terminal symbol.

Returns:

  • (Boolean)


# File 'lib/taskjuggler/TextParser/Pattern.rb', line 281

def terminalSymbol?(i)
  @tokens[i][0] == :variable || @tokens[i][0] == :literal
end

#terminalTokens(rules, index = 0) ⇒ Object

Recursively find the first terminal tokens of this pattern. If an index is specified, the search starts at the index-th pattern token instead of the first. The return value is an Array of [ token, pattern ] tuples.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 288

def terminalTokens(rules, index = 0)
  type, name = @tokens[index]
  # Terminal tokens start with an underscore or dollar character.
  if type == :literal
    return [ [ name, self ] ]
  elsif type == :variable
    return []
  elsif type == :reference
    # We have to continue the search at this rule.
    rule = rules[name]
    # Collect the terminal tokens of all patterns of the referenced
    # rule.
    tts = []
    rule.patterns.each { |p| tts += p.terminalTokens(rules, 0) }
    return tts
  else
    raise "Unexpected token #{type} #{name}"
  end
end
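A simplified sketch of this recursion, again over a toy rules Hash of token lists; note it collects bare token names rather than the [ token, pattern ] tuples the real method returns:

```ruby
# Collect the first terminal token of a pattern, descending into
# referenced rules; variables contribute nothing.
def first_terminals(tokens, rules, index = 0)
  type, name = tokens[index]
  case type
  when :literal
    [ name ]
  when :variable
    []
  when :reference
    rules[name].flat_map { |pattern| first_terminals(pattern, rules, 0) }
  else
    raise "Unexpected token #{type} #{name}"
  end
end

rules = { :attr => [ [ [ :literal, 'task' ] ], [ [ :literal, 'resource' ] ] ] }
first_terminals([ [ :reference, :attr ] ], rules) # => ["task", "resource"]
```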

#to_s ⇒ Object

Generate a text form of the pattern. This is similar to the syntax in the original syntax description.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 390

def to_s
  str = ""
  @tokens.each do |type, name|
    case type
    when :reference
      str += "!#{name} "
    when :variable
      str += "$#{name} "
    when :literal
      str += "#{name} "
    when :eof
      str += ". "
    else
      raise "Unknown type #{type}"
    end
  end

  str
end
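The rendering can be tried standalone. This sketch mirrors the case statement above over a plain token Array; as in the method itself, each token is followed by a trailing space and literals are not re-prefixed with an underscore:

```ruby
# Render a [type, name] token list the way Pattern#to_s does.
def render_pattern(tokens)
  tokens.map do |type, name|
    case type
    when :reference then "!#{name} "
    when :variable  then "$#{name} "
    when :literal   then "#{name} "
    when :eof       then ". "
    else raise "Unknown type #{type}"
    end
  end.join
end

render_pattern([ [ :literal, 'task' ], [ :variable, :ID ], [ :eof, '<END>' ] ])
# => "task $ID . "
```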

#to_syntax(argDocs, rules, skip = 0) ⇒ Object

Returns a string that expresses the elements of the pattern in an EBNF-like fashion. The resolution of the pattern is done recursively. This is just the wrapper function that sets up the stack.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 311

def to_syntax(argDocs, rules, skip = 0)
  to_syntax_r({}, argDocs, rules, skip)
end

#to_syntax_r(stack, argDocs, rules, skip) ⇒ Object

Generate a syntax description for this pattern.



# File 'lib/taskjuggler/TextParser/Pattern.rb', line 316

def to_syntax_r(stack, argDocs, rules, skip)
  # If we find ourselves on the stack, we have hit a recursive pattern.
  # This is used in repetitions.
  if stack[self]
    return '[, ... ]'
  end

  # "Push" us on the stack.
  stack[self] = true

  str = ''
  first = true
  # Analyze the tokens of the pattern skipping the first 'skip' tokens.
  skip.upto(@lastSyntaxToken) do |i|
    type, name = @tokens[i]
    # If the first token is a _{ the pattern describes optional attributes.
    # They are represented by a standard idiom.
    if first
      first = false
      return '{ <attributes> }' if name == '{'
    else
      # Separate the syntax elements by a whitespace.
      str << ' '
    end

    if @args[i]
      # The argument is documented in the syntax definition. We copy the
      # entry as we need to modify it.
      argDoc = @args[i].dup

      # A documented argument without a name is a terminal token. We use the
      # terminal symbol as name.
      if @args[i].name.nil?
        str << "#{name}"
        argDoc.name = name
      else
        str << "<#{@args[i].name}>"
      end
      addArgDoc(argDocs, argDoc)

      # Documented arguments don't have the type set yet. Use the token
      # value for that.
      if type == :variable
        argDoc.typeSpec = "<#{name}>"
      end
    else
      # Undocumented tokens are recursively expanded.
      case type
      when :literal
        # Literals are shown as such.
        str << name.to_s
      when :variable
        # Variables are enclosed by angle brackets.
        str << "<#{name}>"
      when :reference
        if rules[name].patterns.length == 1 &&
           !rules[name].patterns[0].doc.nil?
          addArgDoc(argDocs, TokenDoc.new(rules[name].patterns[0].keyword,
                                          rules[name].patterns[0]))
          str << '<' + rules[name].patterns[0].keyword + '>'
        else
          # References are followed recursively.
          str << rules[name].to_syntax(stack, argDocs, rules, 0)
        end
      end
    end
  end
  # Remove us from the "stack" again.
  stack.delete(self)
  str
end