Class: Propolize::TextBeingProcessed

Inherits: Object

Includes: Helpers

Defined in: lib/propolize.rb

Overview

A section of source text being processed as part of a document

Constant Summary

@@plainTextParser = Plain text: any text not containing ‘\’, ‘*’, ‘[’ or ‘&’

[/\A[^\\\*\[&]+/m, :processPlainText]

@@backslashParser = Backslash item: ‘\’ followed by the quoted character

[/\A\\(.)/m, :processBackslash]

@@entityParser = An HTML entity: ‘&’, then either an alphanumeric identifier or ‘#’ followed by a number, then ‘;’

[/\A&(([A-Za-z0-9]+)|(#[0-9]+));/m, :processEntity]

@@doubleAsterixParser = A pair of asterisks

[/\A\*\*/m, :processDoubleAsterix]

@@singleAsterixParser = A single asterisk

[/\A\*/m, :processSingleAsterix]

@@linkOrAnchorParser = Text enclosed by ‘[’ and ‘]’, optionally followed by a section enclosed by ‘(’ and ‘)’

[/\A\[([^\]]*)\](\(([^\)]+)\)|)/m, :processLinkOrAnchor]

@@linkTextParsers = Parsers to be applied inside link text (everything except the link/anchor parser)

[@@plainTextParser, @@backslashParser, @@entityParser, 
@@doubleAsterixParser, @@singleAsterixParser]

@@fullTextParsers = Parsers to be applied outside link text

@@linkTextParsers + [@@linkOrAnchorParser]
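
As a rough illustration of how these rules behave, the sketch below copies the regular expressions from the summary above into a standalone list and reports which rule fires first for a few inputs. The names `PARSERS` and `first_match` are hypothetical and exist only for this example; they are not part of the class.

```ruby
# Regexes copied from the constant summary above. PARSERS and first_match
# are hypothetical names used only for this illustration.
PARSERS = [
  [/\A[^\\\*\[&]+/m, :processPlainText],
  [/\A\\(.)/m, :processBackslash],
  [/\A&(([A-Za-z0-9]+)|(#[0-9]+));/m, :processEntity],
  [/\A\*\*/m, :processDoubleAsterix],
  [/\A\*/m, :processSingleAsterix],
  [/\A\[([^\]]*)\](\(([^\)]+)\)|)/m, :processLinkOrAnchor]
].freeze

# Return [handler_name, matched_text] for the first rule that matches at the
# start of +text+, or nil if no rule matches.
def first_match(text)
  PARSERS.each do |regex, handler|
    match = regex.match(text)
    return [handler, match[0]] if match
  end
  nil
end

p first_match("plain *text*")     # [:processPlainText, "plain "]
p first_match("**bold**")         # [:processDoubleAsterix, "**"]
p first_match("*italic*")         # [:processSingleAsterix, "*"]
p first_match("&amp; more")       # [:processEntity, "&amp;"]
p first_match("[text](url) tail") # [:processLinkOrAnchor, "[text](url)"]
p first_match("& b")              # nil
```

The last call shows that, under these patterns, a bare ‘&’ that does not begin an entity matches no rule at all.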

Instance Method Summary

#checkValidAtEnd ⇒ Object

If the ‘*’ or ‘**’ values are not balanced, complain.
#initialize(document, text, writer, weAreInsideALink) ⇒ TextBeingProcessed constructor

Initialise with: document, the source document; text, the actual text string; writer, to which the output is written; weAreInsideALink, whether we are inside a link (if so, don’t attempt to parse any inner links).
#parse ⇒ Object

Parse the source text by repeatedly parsing however much is matched by the first parsing rule in the list of parsing rules that matches anything.
#processAnchor(url) ⇒ Object

Process an anchor definition, which is either a normal anchor definition or a footnote.
#processBackslash(match) ⇒ Object

Process a backslash-quoted character by writing it out as HTML-escaped text.
#processDoubleAsterix(match) ⇒ Object

Process a double asterisk by either starting or finishing an HTML bold section.
#processEntity(match) ⇒ Object

Process an HTML entity by writing it out as is.
#processLink(text, url) ⇒ Object

Process a link definition, which consists of a text definition followed by a URL definition.
#processLinkOrAnchor(match) ⇒ Object

Process a link consisting of [] and optional () section.
#processPlainText(match) ⇒ Object

Process plain text by writing out HTML-escaped text.
#processSingleAsterix(match) ⇒ Object

Process a single asterisk by either starting or finishing an HTML italic section.
#textNotYetParsed ⇒ Object

Having parsed some of the text, how much is left to be parsed?

Methods included from Helpers

#html_escape

Constructor Details

#initialize(document, text, writer, weAreInsideALink) ⇒ `TextBeingProcessed`

Initialise.

document - the source document
text - the actual text string
writer - to which the output is written
weAreInsideALink - are we inside a link? (if so, don’t attempt to parse any inner links)

# File 'lib/propolize.rb', line 173

def initialize(document, text, writer, weAreInsideALink)
  @document = document
  @text = text
  @writer = writer
  @pos = 0
  @italic = false
  @bold = false
  # if we are inside a link (i.e. to be output as <a> tag), _don't_ attempt to parse any links within that link
  @parsers = if weAreInsideALink then @@linkTextParsers else @@fullTextParsers end
end

Instance Method Details

#checkValidAtEnd ⇒ `Object`

If the ‘*’ or ‘**’ values are not balanced, complain.

# File 'lib/propolize.rb', line 277

def checkValidAtEnd
  if @bold then
    raise DocumentError, "unclosed bold span"
  end
  if @italic then
    raise DocumentError, "unclosed italic span"
  end
end
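
The balance check can be modelled in isolation. The following is a minimal sketch, assuming only what the listings on this page show: each marker flips a boolean flag, and the end-of-text check fails if a flag is still set. `SpanState` and `SketchDocumentError` are hypothetical stand-ins for the real class and for `DocumentError`.

```ruby
# Hypothetical stand-in for the DocumentError raised by the real class.
class SketchDocumentError < StandardError; end

# Minimal model of the bold/italic flags: each marker flips its flag, and the
# end-of-text check raises if either flag is still set.
class SpanState
  def initialize
    @bold = false
    @italic = false
  end

  def toggle_bold
    @bold = !@bold
  end

  def toggle_italic
    @italic = !@italic
  end

  def check_valid_at_end
    raise SketchDocumentError, "unclosed bold span" if @bold
    raise SketchDocumentError, "unclosed italic span" if @italic
  end
end

balanced = SpanState.new
balanced.toggle_bold
balanced.toggle_bold
balanced.check_valid_at_end # passes: bold was opened and closed

unbalanced = SpanState.new
unbalanced.toggle_italic
begin
  unbalanced.check_valid_at_end
rescue SketchDocumentError => e
  puts e.message # unclosed italic span
end
```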

#parse ⇒ `Object`

Parse the source text by repeatedly applying the first parsing rule that matches the start of the remaining text. Each time, the parsers are tried in order of priority until a first match is found. That match consumes however much source text it matched. This is repeated until the source text is all used up.

# File 'lib/propolize.rb', line 297

def parse
  #puts "\nPARSING text #{@text.inspect} ..."
  while @pos < @text.length do
    #puts "  parsing remaining text #{textNotYetParsed.inspect} ..."
    match = nil
    i = 0
    textToParse = textNotYetParsed
    # Try the specified parsers in order of priority, stopping at the first match
    while i < @parsers.length and not match
      parser = @parsers[i]
      #puts "   trying #{parser[1]} ..."
      match = parser[0].match(textToParse)
      i += 1
    end
    if match then
      send(parser[1], match)
      fullMatchOffsets = match.offset(0)
      #puts " matched at #{fullMatchOffsets.inspect}, i.e. #{textToParse[fullMatchOffsets[0]...fullMatchOffsets[1]].inspect}"
      @pos += fullMatchOffsets[1]
    else
      raise Exception, "No match on #{textNotYetParsed.inspect}"
    end
  end
end
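
The first-match-wins loop can be sketched independently of the class. The example below is a simplified illustration, not the library's code: it uses a reduced rule list, collects tokens instead of dispatching to handler methods, and raises `ArgumentError` rather than `Exception`. `RULES` and `tokenize` are hypothetical names.

```ruby
# Reduced rule list: [regex anchored at the start of the remaining text, label].
# Order is priority, so the double-asterisk rule precedes the single one.
RULES = [
  [/\A[^\*]+/m, :plain],
  [/\A\*\*/m,   :double_asterisk],
  [/\A\*/m,     :single_asterisk]
].freeze

# Repeatedly apply the first rule that matches the remaining text, advancing
# the position by the length of each match.
def tokenize(text, rules = RULES)
  pos = 0
  tokens = []
  while pos < text.length
    remaining = text[pos..-1]
    match = nil
    rule = rules.find { |regex, _label| match = regex.match(remaining) }
    raise ArgumentError, "No match on #{remaining.inspect}" unless rule
    tokens << [rule[1], match[0]]
    pos += match.offset(0)[1] # end offset of the full match
  end
  tokens
end

p tokenize("a **b**")
# [[:plain, "a "], [:double_asterisk, "**"], [:plain, "b"], [:double_asterisk, "**"]]
```

Rule order matters: if the single-asterisk rule were listed before the double-asterisk rule, the input "**" would be consumed as two single asterisks.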

#processAnchor(url) ⇒ `Object`

Process an anchor definition, which consists of either:

1. A normal anchor definition, consisting of the anchor name followed by ‘:’, or
2. A footnote, consisting of the footnote identifier followed by ‘::’ (the footnote identifier is also the anchor name). This is output as the actual footnote number (assigned previously when a link to the footnote was given) and the HTML anchor.

# File 'lib/propolize.rb', line 249

def processAnchor(url)
  anchorMatch = /^([^\/:]*):$/.match(url)
  if anchorMatch
    @writer.write("<a name=\"#{anchorMatch[1]}\"></a>")
  else
    footnoteMatch = /^([^\/:]*)::$/.match(url)
    if footnoteMatch
      footnoteName = footnoteMatch[1]
      footnoteNumberString = @document.footnoteNumberFor(footnoteName)
      @writer.write("<span class=\"footnoteNumber\">#{footnoteNumberString}</span>" + 
                    "<a name=\"#{footnoteName}\"></a>")
    else
      raise DocumentError, "Invalid URL for anchor: #{url.inspect}"
    end
  end
end
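
To see how the two patterns separate the cases, here is a small sketch that reuses the regular expressions from the listing above and only classifies the input, without writing any HTML. `classify_anchor` and the constant names are hypothetical.

```ruby
# Patterns copied from processAnchor above.
ANCHOR_PATTERN = /^([^\/:]*):$/
FOOTNOTE_PATTERN = /^([^\/:]*)::$/

# Classify the bracketed text as a plain anchor, a footnote, or invalid.
def classify_anchor(definition)
  if (m = ANCHOR_PATTERN.match(definition))
    [:anchor, m[1]]
  elsif (m = FOOTNOTE_PATTERN.match(definition))
    [:footnote, m[1]]
  else
    [:invalid, definition]
  end
end

p classify_anchor("intro:")  # [:anchor, "intro"]
p classify_anchor("note1::") # [:footnote, "note1"]
p classify_anchor("intro")   # [:invalid, "intro"]
```

Because the name part excludes both ‘/’ and ‘:’, a string ending in ‘::’ cannot match the single-colon pattern, so the order of the two tests does not change the result.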

#processBackslash(match) ⇒ `Object`

Process a backslash-quoted character by writing it out as HTML-escaped text



# File 'lib/propolize.rb', line 190

def processBackslash(match)
  @writer.write(html_escape(match[1]))
end

#processDoubleAsterix(match) ⇒ `Object`

Process a double asterisk by either starting or finishing an HTML bold section.

# File 'lib/propolize.rb', line 200

def processDoubleAsterix(match)
  if @bold then
    @writer.write("</b>")
    @bold = false
  else
    @writer.write("<b>")
    @bold = true
  end
end

#processEntity(match) ⇒ `Object`

Process an HTML entity by writing it out as is



# File 'lib/propolize.rb', line 195

def processEntity(match)
  @writer.write(match[0])
end

#processLink(text, url) ⇒ `Object`

Process a link definition, which consists of a text definition (enclosed by ‘[’ and ‘]’) followed by a URL definition (enclosed by ‘(’ and ‘)’). Special cases of a URL definition are:

1. Footnote, represented by a unique footnote identifier followed by ‘::’
2. Anchor link, represented by the anchor name followed by ‘:’

The text definition is recursively parsed, except that link and anchor definitions cannot occur inside the text definition (or rather, they are just ignored).

# File 'lib/propolize.rb', line 227

def processLink(text, url)
  anchorMatch = /^([^\/:]*):$/.match(url)
  footnoteMatch = /^([^\/:]*)::$/.match(url)
  linkTextHtml = @document.processText(text, :weAreInsideALink => true)
  if footnoteMatch then 
    footnoteName = footnoteMatch[1]
    # The footnote has a name (i.e. unique identifier) in the source code, but the footnotes
    # are assigned sequential numbers in the output text.
    footnoteNumber = @document.getNewFootnoteNumberFor(footnoteName)
    @writer.write("<a href=\"##{footnoteName}\" class=\"footnote\">#{footnoteNumber}</a>")
  elsif anchorMatch then
    @writer.write("<a href=\"##{anchorMatch[1]}\">#{linkTextHtml}</a>")
  else
    @writer.write("<a href=\"#{url}\">#{linkTextHtml}</a>")
  end
end
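
The three output forms can be sketched as a pure function. This is an illustration under stated assumptions: `FootnoteCounter` is a hypothetical stand-in for the document's footnote bookkeeping (the source only says numbers are sequential; starting at 1 and reusing a number for a repeated name are assumptions), and `link_html` takes already-rendered link text instead of calling `@document.processText`.

```ruby
# Hypothetical stand-in for the document's footnote numbering. Assumes numbers
# start at 1 and that a repeated name gets the same number back.
class FootnoteCounter
  def initialize
    @numbers = {}
  end

  def new_number_for(name)
    @numbers[name] ||= @numbers.size + 1
  end
end

# Build the <a> element for a link, using the same URL patterns as processLink.
def link_html(link_text_html, url, footnotes)
  if (m = /^([^\/:]*)::$/.match(url))
    number = footnotes.new_number_for(m[1])
    "<a href=\"##{m[1]}\" class=\"footnote\">#{number}</a>"
  elsif (m = /^([^\/:]*):$/.match(url))
    "<a href=\"##{m[1]}\">#{link_text_html}</a>"
  else
    "<a href=\"#{url}\">#{link_text_html}</a>"
  end
end

footnotes = FootnoteCounter.new
puts link_html("see note", "n1::", footnotes)
puts link_html("Intro", "intro:", footnotes)
puts link_html("site", "http://example.com/", footnotes)
```

As in the listing above, the link text is not used in the footnote case: the footnote number becomes the visible text of the link.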

#processLinkOrAnchor(match) ⇒ `Object`

Process a link consisting of [] and optional () section. If the () section is not given, then it is an HTML anchor definition (<a name>), otherwise it represents an HTML link (<a href>).

# File 'lib/propolize.rb', line 268

def processLinkOrAnchor(match)
  if match[3] then
    processLink(match[1], match[3])
  else
    processAnchor(match[1])
  end
end
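
The dispatch depends on whether the third capture group took part in the match. A minimal sketch, using the pattern from the constant summary and a hypothetical helper `classify_bracket` that returns a description instead of calling the handlers:

```ruby
# Pattern copied from @@linkOrAnchorParser.
LINK_OR_ANCHOR = /\A\[([^\]]*)\](\(([^\)]+)\)|)/m

# Report whether the bracketed item would be treated as a link or an anchor.
def classify_bracket(source)
  match = LINK_OR_ANCHOR.match(source)
  return nil unless match
  if match[3]
    { kind: :link, text: match[1], url: match[3] }
  else
    { kind: :anchor, definition: match[1] }
  end
end

p classify_bracket("[Home](index.html)")
p classify_bracket("[intro:]")
p classify_bracket("[x]()")
```

The last case is worth noting: an empty ‘()’ does not satisfy the URL part of the pattern, so the item is treated as an anchor definition and the ‘()’ is left for later rules.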

#processPlainText(match) ⇒ `Object`

Process plain text by writing out HTML-escaped text



# File 'lib/propolize.rb', line 185

def processPlainText(match)
  @writer.write(html_escape(match[0]))
end
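
The plain-text, backslash and entity handlers differ only in which part of the match they write and whether it is escaped. The sketch below illustrates this with a hypothetical `escape_html_sketch` function; the real `Helpers#html_escape` is not shown on this page, so its exact behaviour is an assumption here.

```ruby
# Hypothetical stand-in for Helpers#html_escape (the real helper is not shown
# on this page and may escape a different set of characters).
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

def escape_html_sketch(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

plain     = /\A[^\\\*\[&]+/m.match("1 < 2")
backslash = /\A\\(.)/m.match("\\<")
entity    = /\A&(([A-Za-z0-9]+)|(#[0-9]+));/m.match("&lt;")

puts escape_html_sketch(plain[0])     # whole match, escaped:    1 &lt; 2
puts escape_html_sketch(backslash[1]) # quoted character only:   &lt;
puts entity[0]                        # whole match, unescaped:  &lt;
```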

#processSingleAsterix(match) ⇒ `Object`

Process a single asterisk by either starting or finishing an HTML italic section.

# File 'lib/propolize.rb', line 211

def processSingleAsterix(match)
  if @italic then
    @writer.write("</i>")
    @italic = false
  else
    @writer.write("<i>")
    @italic = true
  end
end
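
The two toggling handlers (processDoubleAsterix and processSingleAsterix) can be tried out together in a small sketch. It assumes the input has already been split into tokens and writes to a `StringIO`; `render_emphasis` is a hypothetical helper, not part of the class.

```ruby
require "stringio"

# Hypothetical helper: applies the same toggling logic to a pre-split token
# list, writing to any object that responds to #write.
def render_emphasis(tokens, writer)
  bold = false
  italic = false
  tokens.each do |token|
    case token
    when "**"
      writer.write(bold ? "</b>" : "<b>")
      bold = !bold
    when "*"
      writer.write(italic ? "</i>" : "<i>")
      italic = !italic
    else
      writer.write(token)
    end
  end
end

out = StringIO.new
render_emphasis(["**", "bold ", "*", "both", "*", "**"], out)
puts out.string # <b>bold <i>both</i></b>
```

Since the two flags are independent, this logic does not check nesting: tokens in the order "**", "a", "*", "b", "**", "c", "*" would produce `<b>a<i>b</b>c</i>`.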

#textNotYetParsed ⇒ `Object`

Having parsed some of the text, how much is left to be parsed?



# File 'lib/propolize.rb', line 287

def textNotYetParsed
  return @text[@pos..-1]
end
