Class: Propolize::TextBeingProcessed
- Inherits:
- Object (hierarchy: Object, Propolize::TextBeingProcessed)
- Includes:
- Helpers
- Defined in:
- lib/propolize.rb
Overview
A section of source text being processed as part of a document
Constant Summary
- @@plainTextParser =
Plain text: any text not containing ‘\’, ‘*’, ‘[’ or ‘&’
[/\A[^\\\*\[&]+/m, :processPlainText]
- @@backslashParser =
Backslash item: ‘\’ followed by the quoted character
[/\A\\(.)/m, :processBackslash]
- @@entityParser =
An HTML entity: ‘&’, then either an alphanumeric identifier or ‘#’ followed by a number, then ‘;’
[/\A&(([A-Za-z0-9]+)|(#[0-9]+));/m, :processEntity]
- @@doubleAsterixParser =
A pair of asterisks
[/\A\*\*/m, :processDoubleAsterix]
- @@singleAsterixParser =
A single asterisk
[/\A\*/m, :processSingleAsterix]
- @@linkOrAnchorParser =
Text enclosed by ‘[’ and ‘]’, optionally followed by a section enclosed by ‘(’ and ‘)’
[/\A\[([^\]]*)\](\(([^\)]+)\)|)/m, :processLinkOrAnchor]
- @@linkTextParsers =
Parsers to be applied inside link text (everything except the link/anchor parser)
[@@plainTextParser, @@backslashParser, @@entityParser, @@doubleAsterixParser, @@singleAsterixParser]
- @@fullTextParsers =
Parsers to be applied outside link text
@@linkTextParsers + [@@linkOrAnchorParser]
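To make the patterns above easier to read, here is a minimal sketch that applies them to sample input. The regular expressions are copied from the constant summary; the constant names and the `consumed_prefix` helper are hypothetical and are not part of Propolize.

```ruby
# Patterns copied from the constant summary; the names here are illustrative only.
PLAIN_TEXT      = /\A[^\\\*\[&]+/m
BACKSLASH       = /\A\\(.)/m
ENTITY          = /\A&(([A-Za-z0-9]+)|(#[0-9]+));/m
DOUBLE_ASTERISK = /\A\*\*/m
SINGLE_ASTERISK = /\A\*/m
LINK_OR_ANCHOR  = /\A\[([^\]]*)\](\(([^\)]+)\)|)/m

# Hypothetical helper: return the text the pattern consumes at the start
# of the input, or nil if the pattern does not match there.
def consumed_prefix(pattern, input)
  match = pattern.match(input)
  match && match[0]
end

p consumed_prefix(PLAIN_TEXT, "hello *world*")       # => "hello "
p consumed_prefix(PLAIN_TEXT, "*starts with a star") # => nil
p consumed_prefix(ENTITY, "&amp; more")              # => "&amp;"
p consumed_prefix(DOUBLE_ASTERISK, "**bold**")       # => "**"
p consumed_prefix(LINK_OR_ANCHOR, "[text](http://example.com) rest")
```

Every pattern is anchored with `\A`, so it can only match at the start of the remaining text. Listing the double-asterisk pattern before the single-asterisk pattern matters, because the single-asterisk pattern would also match the first character of `**`.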
Instance Method Summary
-
#checkValidAtEnd ⇒ Object
Raises a DocumentError if a ‘**’ (bold) or ‘*’ (italic) span is left unclosed.
-
#initialize(document, text, writer, weAreInsideALink) ⇒ TextBeingProcessed
constructor
Initialise with the source document, the actual text string, the writer to which the output is written, and a flag saying whether we are inside a link (if so, don’t attempt to parse any inner links).
-
#parse ⇒ Object
Parse the source text by repeatedly parsing however much is matched by the first parsing rule in the list of parsing rules that matches anything.
-
#processAnchor(url) ⇒ Object
Process an anchor definition, which is either a normal anchor (name followed by ‘:’) or a footnote (identifier followed by ‘::’).
-
#processBackslash(match) ⇒ Object
Process a backslash-quoted character by writing it out as HTML-escaped text.
-
#processDoubleAsterix(match) ⇒ Object
Process a double asterisk by either starting or finishing an HTML bold section.
-
#processEntity(match) ⇒ Object
Process an HTML entity by writing it out as is.
-
#processLink(text, url) ⇒ Object
Process a link definition, which consists of a text definition followed by a URL definition; footnotes (‘::’) and anchor links (‘:’) are special cases of the URL definition.
-
#processLinkOrAnchor(match) ⇒ Object
Process a link or anchor consisting of a [] section and an optional () section.
-
#processPlainText(match) ⇒ Object
Process plain text by writing out HTML-escaped text.
-
#processSingleAsterix(match) ⇒ Object
Process a single asterisk by either starting or finishing an HTML italic section.
-
#textNotYetParsed ⇒ Object
Returns the portion of the text that has not yet been parsed.
Methods included from Helpers
Constructor Details
#initialize(document, text, writer, weAreInsideALink) ⇒ TextBeingProcessed
Initialise.
- document - the source document
- text - the actual text string
- writer - to which the output is written
- weAreInsideALink - are we inside a link? (if so, don’t attempt to parse any inner links)

# File 'lib/propolize.rb', line 173

def initialize(document, text, writer, weAreInsideALink)
  @document = document
  @text = text
  @writer = writer
  @pos = 0
  @italic = false
  @bold = false
  # if we are inside a link (i.e. to be output as <a> tag), _don't_ attempt to parse any links within that link
  @parsers = if weAreInsideALink then @@linkTextParsers else @@fullTextParsers end
end
Instance Method Details
#checkValidAtEnd ⇒ Object
Raises a DocumentError if a ‘**’ (bold) or ‘*’ (italic) span is left unclosed.

# File 'lib/propolize.rb', line 277

def checkValidAtEnd
  if @bold then
    raise DocumentError, "unclosed bold span"
  end
  if @italic then
    raise DocumentError, "unclosed italic span"
  end
end
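The flags checked here are the ones toggled by #processDoubleAsterix and #processSingleAsterix. The following is a minimal sketch of that interaction, not the library's code: `EmphasisTracker`, `SketchDocumentError` and the method names are hypothetical stand-ins, and a `StringIO` plays the role of the writer.

```ruby
require 'stringio'

# Hypothetical stand-in for the DocumentError raised by the real method.
class SketchDocumentError < StandardError; end

# Hypothetical stand-in that tracks only the bold and italic flags.
class EmphasisTracker
  def initialize(writer)
    @writer = writer
    @bold = false
    @italic = false
  end

  # Open a bold section if none is open, otherwise close it.
  def toggle_bold
    @writer.write(@bold ? "</b>" : "<b>")
    @bold = !@bold
  end

  # Open an italic section if none is open, otherwise close it.
  def toggle_italic
    @writer.write(@italic ? "</i>" : "<i>")
    @italic = !@italic
  end

  # Complain if any section is still open once the text is used up.
  def check_valid_at_end
    raise SketchDocumentError, "unclosed bold span" if @bold
    raise SketchDocumentError, "unclosed italic span" if @italic
  end
end

output = StringIO.new
tracker = EmphasisTracker.new(output)
tracker.toggle_bold        # writes "<b>"
tracker.toggle_bold        # writes "</b>"
tracker.check_valid_at_end # balanced, so nothing is raised
```

Because each marker simply flips a flag, an odd number of ‘**’ or ‘*’ markers leaves a flag set, which is exactly the condition the end-of-text check reports.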
#parse ⇒ Object
Parse the source text by repeatedly parsing however much is matched by the first parsing rule in the list of parsing rules that matches anything. Each time, the parsers are applied in order of priority, until a first match is found. This match uses up whatever amount of source text it matched. This is repeated until the source text is all used up.

# File 'lib/propolize.rb', line 297

def parse
  #puts "\nPARSING text #{@text.inspect} ..."
  while @pos < @text.length do
    #puts " parsing remaining text #{textNotYetParsed.inspect} ..."
    match = nil
    i = 0
    textToParse = textNotYetParsed
    # Try the specified parsers in order of priority, stopping at the first match
    while i < @parsers.length and not match
      parser = @parsers[i]
      #puts " trying #{parser[1]} ..."
      match = parser[0].match(textToParse)
      i += 1
    end
    if match then
      send(parser[1], match)
      fullMatchOffsets = match.offset(0)
      #puts " matched at #{fullMatchOffsets.inspect}, i.e. #{textToParse[fullMatchOffsets[0]...fullMatchOffsets[1]].inspect}"
      @pos += fullMatchOffsets[1]
    else
      raise Exception, "No match on #{textNotYetParsed.inspect}"
    end
  end
end
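The first-match loop described above can be illustrated in isolation. This is a simplified sketch under stated assumptions: the `RULES` table and the `tokenize` function are hypothetical, only three rules are included, and the sketch collects labelled tokens instead of dispatching to handler methods and writing HTML.

```ruby
# Hypothetical rule table: [pattern, label] pairs, highest priority first.
# Every pattern is anchored with \A and consumes at least one character.
RULES = [
  [/\A[^*]+/m, :plain],
  [/\A\*\*/m,  :double_asterisk],
  [/\A\*/m,    :single_asterisk]
]

# Consume the text from left to right, recording which rule matched each piece.
def tokenize(text, rules = RULES)
  pos = 0
  tokens = []
  while pos < text.length
    remaining = text[pos..-1]
    # Take the first rule, in priority order, that matches the remaining text.
    pattern, label = rules.find { |candidate, _| candidate.match(remaining) }
    raise ArgumentError, "No match on #{remaining.inspect}" unless pattern
    match = pattern.match(remaining)
    tokens << [label, match[0]]
    # Advance past whatever the winning rule consumed.
    pos += match.end(0)
  end
  tokens
end

p tokenize("a**b*")
# => [[:plain, "a"], [:double_asterisk, "**"], [:plain, "b"], [:single_asterisk, "*"]]
```

The loop terminates only because each rule consumes at least one character; a rule that could match the empty string would leave the position unchanged and loop forever.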
#processAnchor(url) ⇒ Object
Process an anchor definition, which consists of either:
1. A normal anchor definition, consisting of the anchor name followed by ‘:’, or
2. A footnote, consisting of the footnote identifier followed by ‘::’ (the footnote identifier is also the anchor name). This is output as the actual footnote number (assigned previously when a link to the footnote was given), and the HTML anchor.

# File 'lib/propolize.rb', line 249

def processAnchor(url)
  anchorMatch = /^([^\/:]*):$/.match(url)
  if anchorMatch
    @writer.write("<a name=\"#{anchorMatch[1]}\"></a>")
  else
    footnoteMatch = /^([^\/:]*)::$/.match(url)
    if footnoteMatch
      footnoteName = footnoteMatch[1]
      footnoteNumberString = @document.footnoteNumberFor(footnoteName)
      @writer.write("<span class=\"footnoteNumber\">#{footnoteNumberString}</span>" +
                    "<a name=\"#{footnoteName}\"></a>")
    else
      raise DocumentError, "Invalid URL for anchor: #{url.inspect}"
    end
  end
end
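The two regular expressions in the listing decide which kind of anchor a definition is. The sketch below isolates that decision; `classify_anchor` is a hypothetical helper that returns a label instead of writing HTML or raising an error.

```ruby
# Hypothetical helper: classify the text found between '[' and ']'.
# The two patterns are the ones used in the listing above.
def classify_anchor(definition)
  if (anchor = /^([^\/:]*):$/.match(definition))
    [:anchor, anchor[1]]
  elsif (footnote = /^([^\/:]*)::$/.match(definition))
    [:footnote, footnote[1]]
  else
    [:invalid, definition]
  end
end

p classify_anchor("intro:")   # => [:anchor, "intro"]
p classify_anchor("note1::")  # => [:footnote, "note1"]
p classify_anchor("http://x") # => [:invalid, "http://x"]
```

The name part may not contain ‘/’ or ‘:’, which is why `note1::` cannot match the single-colon pattern and why a URL-like string matches neither.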
#processBackslash(match) ⇒ Object
Process a backslash-quoted character by writing it out as HTML-escaped text
# File 'lib/propolize.rb', line 190

def processBackslash(match)
  @writer.write(html_escape(match[1]))
end
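As an illustration of what a backslash item produces, the sketch below matches the backslash pattern and escapes the quoted character. It assumes that the `html_escape` method supplied by Helpers behaves like Ruby's standard `ERB::Util.html_escape`, which this page does not confirm; `unquote_backslash` is a hypothetical helper.

```ruby
require 'erb'

# Pattern copied from @@backslashParser in the constant summary.
BACKSLASH_PATTERN = /\A\\(.)/m

# Hypothetical helper: return the HTML-escaped quoted character, or nil
# if the input does not start with a backslash item.
def unquote_backslash(source)
  match = BACKSLASH_PATTERN.match(source)
  match && ERB::Util.html_escape(match[1])
end

p unquote_backslash("\\*not italic") # => "*"
p unquote_backslash("\\<b>")         # => "&lt;"
p unquote_backslash("plain")         # => nil
```

Quoting a character this way keeps ‘*’, ‘[’, ‘&’ and ‘\’ from being treated as markup, while characters that are special in HTML are still escaped on output.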
#processDoubleAsterix(match) ⇒ Object
Process a double asterisk by either starting or finishing an HTML bold section.

# File 'lib/propolize.rb', line 200

def processDoubleAsterix(match)
  if @bold then
    @writer.write("</b>")
    @bold = false
  else
    @writer.write("<b>")
    @bold = true
  end
end
#processEntity(match) ⇒ Object
Process an HTML entity by writing it out as is
# File 'lib/propolize.rb', line 195

def processEntity(match)
  @writer.write(match[0])
end
#processLink(text, url) ⇒ Object
Process a link definition, which consists of a text definition (enclosed by ‘[’ and ‘]’) followed by a URL definition (enclosed by ‘(’ and ‘)’). Special cases of a URL definition are:
1. Footnote, represented by a unique footnote identifier followed by ‘::’
2. Anchor link, represented by the anchor name followed by ‘:’
The text definition is recursively parsed, except that link and anchor definitions cannot occur inside the text definition (or rather, they are just ignored).

# File 'lib/propolize.rb', line 227

def processLink (text, url)
  anchorMatch = /^([^\/:]*):$/.match(url)
  footnoteMatch = /^([^\/:]*)::$/.match(url)
  linkTextHtml = @document.processText(text, :weAreInsideALink => true)
  if footnoteMatch then
    footnoteName = footnoteMatch[1]
    # The footnote has a name (i.e. unique identifier) in the source code, but the footnotes
    # are assigned sequential numbers in the output text.
    footnoteNumber = @document.getNewFootnoteNumberFor(footnoteName)
    @writer.write("<a href=\"##{footnoteName}\" class=\"footnote\">#{footnoteNumber}</a>")
  elsif anchorMatch then
    @writer.write("<a href=\"##{anchorMatch[1]}\">#{linkTextHtml}</a>")
  else
    @writer.write("<a href=\"#{url}\">#{linkTextHtml}</a>")
  end
end
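The three URL forms differ only in the `href` value that ends up in the output. The sketch below shows that mapping on its own; `href_for` is a hypothetical helper, and it leaves out the footnote numbering and link text handling that the real method delegates to the document.

```ruby
# Hypothetical helper: compute the href value for the URL part of a link,
# using the same two patterns as the listing above.
def href_for(url)
  if (footnote = /^([^\/:]*)::$/.match(url))
    "##{footnote[1]}" # footnote link: points at the footnote's anchor
  elsif (anchor = /^([^\/:]*):$/.match(url))
    "##{anchor[1]}"   # anchor link: points at a named anchor on the page
  else
    url               # anything else is used as the URL unchanged
  end
end

p href_for("note1::")             # => "#note1"
p href_for("top:")                # => "#top"
p href_for("http://example.com/") # => "http://example.com/"
```

In the listing, a footnote link displays the assigned footnote number rather than the link text, while the other two forms display the processed link text.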
#processLinkOrAnchor(match) ⇒ Object
Process a link consisting of [] and optional () section. If the () section is not given, then it is an HTML anchor definition (<a name>), otherwise it represents an HTML link (<a href>).
# File 'lib/propolize.rb', line 268

def processLinkOrAnchor(match)
  if match[3] then
    processLink(match[1], match[3])
  else
    processAnchor(match[1])
  end
end
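The choice between link and anchor depends on whether the third capture group of the link/anchor pattern is present. The sketch below shows which groups are filled for each form; `describe_bracket_item` is a hypothetical helper that returns a hash instead of calling #processLink or #processAnchor.

```ruby
# Pattern copied from @@linkOrAnchorParser in the constant summary.
LINK_OR_ANCHOR_PATTERN = /\A\[([^\]]*)\](\(([^\)]+)\)|)/m

# Hypothetical helper: report how a bracketed item would be dispatched.
def describe_bracket_item(source)
  match = LINK_OR_ANCHOR_PATTERN.match(source)
  return nil unless match
  if match[3]
    # A () section was given, so group 3 holds the URL definition.
    { kind: :link, text: match[1], url: match[3] }
  else
    # No () section, so group 1 holds the anchor definition.
    { kind: :anchor, definition: match[1] }
  end
end

p describe_bracket_item("[Home](index.html)")
# => {:kind=>:link, :text=>"Home", :url=>"index.html"} on Ruby versions before 3.4
p describe_bracket_item("[top:]")
p describe_bracket_item("no brackets") # => nil
```

Group 2 wraps the whole optional () section and matches the empty string when that section is absent, which is why the check is made on group 3 instead.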
#processPlainText(match) ⇒ Object
Process plain text by writing out HTML-escaped text
# File 'lib/propolize.rb', line 185

def processPlainText(match)
  @writer.write(html_escape(match[0]))
end
#processSingleAsterix(match) ⇒ Object
Process a single asterisk by either starting or finishing an HTML italic section.

# File 'lib/propolize.rb', line 211

def processSingleAsterix(match)
  if @italic then
    @writer.write("</i>")
    @italic = false
  else
    @writer.write("<i>")
    @italic = true
  end
end
#textNotYetParsed ⇒ Object
Returns the portion of the text that has not yet been parsed.

# File 'lib/propolize.rb', line 287

def textNotYetParsed
  return @text[@pos..-1]
end