Class: Propolize::TextBeingProcessed

Inherits:
Object
Includes:
Helpers
Defined in:
lib/propolize.rb

Overview

A section of source text being processed as part of a document

Constant Summary

@@plainTextParser =

Plain text: any text not containing ‘\’, ‘*’, ‘[’ or ‘&’

[/\A[^\\\*\[&]+/m, :processPlainText]
@@backslashParser =

Backslash item: ‘\’ followed by the quoted character

[/\A\\(.)/m, :processBackslash]
@@entityParser =

An HTML entity: ‘&’, then either an alphanumeric identifier or ‘#’ followed by a number, then ‘;’

[/\A&(([A-Za-z0-9]+)|(#[0-9]+));/m, :processEntity]
@@doubleAsterixParser =

A pair of asterisks

[/\A\*\*/m, :processDoubleAsterix]
@@singleAsterixParser =

A single asterisk

[/\A\*/m, :processSingleAsterix]
@@linkOrAnchorParser =

Text enclosed by ‘[’ and ‘]’, optionally followed by a section enclosed by ‘(’ and ‘)’

[/\A\[([^\]]*)\](\(([^\)]+)\)|)/m, :processLinkOrAnchor]
@@linkTextParsers =

Parsers to be applied inside link text (everything except the link/anchor parser)

[@@plainTextParser, @@backslashParser, @@entityParser, 
@@doubleAsterixParser, @@singleAsterixParser]
@@fullTextParsers =

Parsers to be applied outside link text

@@linkTextParsers + [@@linkOrAnchorParser]
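
The following sketch may help show what some of these patterns match in isolation. The regular expressions are copied from the constants above; the constant names (PLAIN_TEXT and so on) are illustrative only and are not part of the library.

```ruby
# Regexps copied from the constant summary; the names are hypothetical.
PLAIN_TEXT      = /\A[^\\\*\[&]+/m
ENTITY          = /\A&(([A-Za-z0-9]+)|(#[0-9]+));/m
DOUBLE_ASTERISK = /\A\*\*/m
SINGLE_ASTERISK = /\A\*/m

# Plain text stops at the first special character.
PLAIN_TEXT.match("some text *emphasised*")[0]   # => "some text "

# Numeric entities are matched by the '#' alternative.
ENTITY.match("&#169; 2012")[0]                  # => "&#169;"

# Both asterisk patterns match at the start of "**bold**", which is
# why the double-asterisk parser is listed before the single one.
SINGLE_ASTERISK.match("**bold**")[0]            # => "*"
DOUBLE_ASTERISK.match("**bold**")[0]            # => "**"

# A bare '&' is neither plain text nor a complete entity.
PLAIN_TEXT.match("& alone")                     # => nil
ENTITY.match("& alone")                         # => nil
```

Because every pattern is anchored with \A, each one can only match at the very start of the text it is given.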

Instance Method Summary

Methods included from Helpers

#html_escape

Constructor Details

#initialize(document, text, writer, weAreInsideALink) ⇒ TextBeingProcessed

Initialise.

  • document - source document
  • text - the actual text string
  • writer - to which the output is written
  • weAreInsideALink - are we inside a link? (if so, don’t attempt to parse any inner links)



# File 'lib/propolize.rb', line 173

def initialize(document, text, writer, weAreInsideALink)
  @document = document
  @text = text
  @writer = writer
  @pos = 0
  @italic = false
  @bold = false
  # if we are inside a link (i.e. to be output as <a> tag), _don't_ attempt to parse any links within that link
  @parsers = if weAreInsideALink then @@linkTextParsers else @@fullTextParsers end
end

Instance Method Details

#checkValidAtEnd ⇒ Object

Raise a DocumentError if a ‘**’ (bold) or ‘*’ (italic) span is still open when the end of the text is reached.



# File 'lib/propolize.rb', line 277

def checkValidAtEnd
  if @bold then
    raise DocumentError, "unclosed bold span"
  end
  if @italic then
    raise DocumentError, "unclosed italic span"
  end
end

#parse ⇒ Object

Parse the source text by repeatedly applying the parsing rules. At each step the parsers are tried in order of priority against the text not yet parsed, and the first one that matches is used: its handler is called and the matched text is consumed. This is repeated until the source text is all used up. If no parser matches the remaining text, an exception is raised.



# File 'lib/propolize.rb', line 297

def parse
  #puts "\nPARSING text #{@text.inspect} ..."
  while @pos < @text.length do
    #puts "  parsing remaining text #{textNotYetParsed.inspect} ..."
    match = nil
    i = 0
    textToParse = textNotYetParsed
    # Try the specified parsers in order of priority, stopping at the first match
    while i < @parsers.length and not match
      parser = @parsers[i]
      #puts "   trying #{parser[1]} ..."
      match = parser[0].match(textToParse)
      i += 1
    end
    if match then
      send(parser[1], match)
      fullMatchOffsets = match.offset(0)
      #puts " matched at #{fullMatchOffsets.inspect}, i.e. #{textToParse[fullMatchOffsets[0]...fullMatchOffsets[1]].inspect}"
      @pos += fullMatchOffsets[1]
    else
      raise Exception, "No match on #{textNotYetParsed.inspect}"
    end
  end
end
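
As a rough, standalone illustration of the first-match-wins loop above, the sketch below collects the matched pieces into an array instead of dispatching to handler methods. The RULES constant, the tokenize method and the use of ArgumentError are assumptions made for the example; only the regular expressions come from this class.

```ruby
# A reduced rule list in priority order (regexps copied from the class,
# rule names hypothetical).
RULES = [
  [/\A[^\\\*\[&]+/m, :plain],
  [/\A\*\*/m,        :double_asterisk],
  [/\A\*/m,          :single_asterisk]
].freeze

# Return [rule_name, matched_text] pairs for the whole of text.
def tokenize(text)
  pos = 0
  tokens = []
  while pos < text.length
    remaining = text[pos..-1]
    name = nil
    match = nil
    # Try the rules in order, stopping at the first one that matches.
    RULES.each do |regexp, rule_name|
      match = regexp.match(remaining)
      if match
        name = rule_name
        break
      end
    end
    raise ArgumentError, "No match on #{remaining.inspect}" unless match
    tokens << [name, match[0]]
    # offset(0)[1] is the end of the full match, i.e. the amount consumed.
    pos += match.offset(0)[1]
  end
  tokens
end

tokenize("a **b** *c*")
# => [[:plain, "a "], [:double_asterisk, "**"], [:plain, "b"],
#     [:double_asterisk, "**"], [:plain, " "], [:single_asterisk, "*"],
#     [:plain, "c"], [:single_asterisk, "*"]]
```

With this reduced rule list, input such as "a & b" raises once the loop reaches the ‘&’, since no rule in the list matches it.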

#processAnchor(url) ⇒ Object

Process an anchor definition which consists of either:

  1. A normal anchor definition, consisting of the anchor name followed by ‘:’, or

  2. A footnote, consisting of the footnote identifier followed by ‘::’ (the footnote identifier is also the anchor name). This is output as the actual footnote number (assigned previously when a link to the footnote was given), followed by the HTML anchor.



# File 'lib/propolize.rb', line 249

def processAnchor(url)
  anchorMatch = /^([^\/:]*):$/.match(url)
  if anchorMatch
    @writer.write("<a name=\"#{anchorMatch[1]}\"></a>")
  else
    footnoteMatch = /^([^\/:]*)::$/.match(url)
    if footnoteMatch
      footnoteName = footnoteMatch[1]
      footnoteNumberString = @document.footnoteNumberFor(footnoteName)
      @writer.write("<span class=\"footnoteNumber\">#{footnoteNumberString}</span>" + 
                    "<a name=\"#{footnoteName}\"></a>")
    else
      raise DocumentError, "Invalid URL for anchor: #{url.inspect}"
    end
  end
end
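
The two patterns used above can be tried on their own. In this sketch the regular expressions are taken from the method body, while the constant names and the classify_anchor helper are hypothetical.

```ruby
# Regexps copied from processAnchor; names are illustrative only.
ANCHOR_PATTERN   = /^([^\/:]*):$/
FOOTNOTE_PATTERN = /^([^\/:]*)::$/

# Classify an anchor definition the same way processAnchor branches.
def classify_anchor(definition)
  if (match = ANCHOR_PATTERN.match(definition))
    [:anchor, match[1]]
  elsif (match = FOOTNOTE_PATTERN.match(definition))
    [:footnote, match[1]]
  else
    [:invalid, definition]
  end
end

classify_anchor("intro:")               # => [:anchor, "intro"]
classify_anchor("note1::")              # => [:footnote, "note1"]
classify_anchor("http://example.com/")  # => [:invalid, "http://example.com/"]
```

Since the name part may not contain ‘/’ or ‘:’, an ordinary URL falls through to the invalid case, which is where processAnchor raises a DocumentError.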

#processBackslash(match) ⇒ Object

Process a backslash-quoted character by writing it out as HTML-escaped text



# File 'lib/propolize.rb', line 190

def processBackslash(match)
  @writer.write(html_escape(match[1]))
end

#processDoubleAsterix(match) ⇒ Object

Process a double asterisk by either starting or finishing an HTML bold section.



# File 'lib/propolize.rb', line 200

def processDoubleAsterix(match)
  if @bold then
    @writer.write("</b>")
    @bold = false
  else
    @writer.write("<b>")
    @bold = true
  end
end
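
The toggle behaviour can be sketched in isolation as follows. BoldToggler and bold_demo are hypothetical names used only for this example, and a StringIO stands in for the writer, on the assumption that the writer only needs to respond to write.

```ruby
require 'stringio'

# Minimal stand-in for the bold state kept by TextBeingProcessed.
class BoldToggler
  def initialize(writer)
    @writer = writer
    @bold = false
  end

  # Called once per "**": opens a bold span, or closes the open one.
  def toggle
    @writer.write(@bold ? "</b>" : "<b>")
    @bold = !@bold
  end

  # True when every opened span has been closed.
  def balanced?
    !@bold
  end
end

# Write text wrapped in toggles_before / toggles_after "**" markers.
def bold_demo(text, toggles_before, toggles_after)
  out = StringIO.new
  toggler = BoldToggler.new(out)
  toggles_before.times { toggler.toggle }
  out.write(text)
  toggles_after.times { toggler.toggle }
  [out.string, toggler.balanced?]
end

bold_demo("important", 1, 1)  # => ["<b>important</b>", true]
bold_demo("important", 1, 0)  # => ["<b>important", false]
```

The second call leaves the span open, which is the situation checkValidAtEnd reports as "unclosed bold span".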

#processEntity(match) ⇒ Object

Process an HTML entity by writing it out as is



# File 'lib/propolize.rb', line 195

def processEntity(match)
  @writer.write(match[0])
end
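
To contrast the entity handler with the backslash handler, here is a small sketch. The escape_html_sketch method is only a stand-in for Helpers#html_escape, whose actual implementation is not shown on this page; the other names are hypothetical as well.

```ruby
# Assumed escaping table; the real html_escape may differ.
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

def escape_html_sketch(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Regexps copied from the constant summary.
BACKSLASH_PATTERN = /\A\\(.)/m
ENTITY_PATTERN    = /\A&(([A-Za-z0-9]+)|(#[0-9]+));/m

# Backslash item: the quoted character (group 1) is escaped.
def backslash_output(source)
  escape_html_sketch(BACKSLASH_PATTERN.match(source)[1])
end

# Entity: the whole match (group 0) is passed through unchanged.
def entity_output(source)
  ENTITY_PATTERN.match(source)[0]
end

backslash_output("\\& co")     # => "&amp;"
backslash_output("\\*")        # => "*"
entity_output("&copy; 2012")   # => "&copy;"
```

Under these assumptions, a literal ampersand in the source would be written as a backslash followed by ‘&’, while an already-formed entity is left untouched.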

#processLink(text, url) ⇒ Object

Process a link definition, which consists of a text definition (in square brackets) followed by a URL definition (in parentheses). Special cases of a URL definition are:

  1. Footnote, represented by a unique footnote identifier followed by ‘::’

  2. Anchor link, represented by the anchor name followed by ‘:’

The text definition is recursively parsed, except that link and anchor definitions cannot occur inside the text definition (or rather, they are just ignored).



# File 'lib/propolize.rb', line 227

def processLink (text, url)
  anchorMatch = /^([^\/:]*):$/.match(url)
  footnoteMatch = /^([^\/:]*)::$/.match(url)
  linkTextHtml = @document.processText(text, :weAreInsideALink => true)
  if footnoteMatch then 
    footnoteName = footnoteMatch[1]
    # The footnote has a name (i.e. unique identifier) in the source code, but the footnotes
    # are assigned sequential numbers in the output text.
    footnoteNumber = @document.getNewFootnoteNumberFor(footnoteName)
    @writer.write("<a href=\"##{footnoteName}\" class=\"footnote\">#{footnoteNumber}</a>")
  elsif anchorMatch then
    @writer.write("<a href=\"##{anchorMatch[1]}\">#{linkTextHtml}</a>")
  else
    @writer.write("<a href=\"#{url}\">#{linkTextHtml}</a>")
  end
end

#processLinkOrAnchor(match) ⇒ Object

Process a link or anchor consisting of a [] section and an optional () section. If the () section is not given, then it is an HTML anchor definition (<a name>), otherwise it represents an HTML link (<a href>).



# File 'lib/propolize.rb', line 268

def processLinkOrAnchor(match)
  if match[3] then
    processLink(match[1], match[3])
  else
    processAnchor(match[1])
  end
end
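
The capture groups this method relies on can be inspected directly. The regular expression below is the one from @@linkOrAnchorParser; the constant name and the describe_link_or_anchor helper are hypothetical.

```ruby
# Regexp copied from @@linkOrAnchorParser.
LINK_OR_ANCHOR = /\A\[([^\]]*)\](\(([^\)]+)\)|)/m

# Group 1 is the [] content; group 3 is the () content, or nil when
# the () section is absent.
def describe_link_or_anchor(source)
  match = LINK_OR_ANCHOR.match(source)
  return nil unless match
  if match[3]
    { kind: :link, text: match[1], url: match[3] }
  else
    { kind: :anchor, definition: match[1] }
  end
end

describe_link_or_anchor("[Ruby](http://www.ruby-lang.org/) is a language")
# => {:kind=>:link, :text=>"Ruby", :url=>"http://www.ruby-lang.org/"}

describe_link_or_anchor("[intro:] Introduction")
# => {:kind=>:anchor, :definition=>"intro:"}
```

Note that the () section must contain at least one character to count as a link; with empty parentheses the pattern falls back to its empty alternative and the item is treated as an anchor definition.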

#processPlainText(match) ⇒ Object

Process plain text by writing out HTML-escaped text



# File 'lib/propolize.rb', line 185

def processPlainText(match)
  @writer.write(html_escape(match[0]))
end

#processSingleAsterix(match) ⇒ Object

Process a single asterisk by either starting or finishing an HTML italic section.



# File 'lib/propolize.rb', line 211

def processSingleAsterix(match)
  if @italic then
    @writer.write("</i>")
    @italic = false
  else
    @writer.write("<i>")
    @italic = true
  end
end

#textNotYetParsed ⇒ Object

Return the part of the text that has not yet been parsed, i.e. everything from the current position to the end.



# File 'lib/propolize.rb', line 287

def textNotYetParsed
  return @text[@pos..-1]
end