A BBcode parser designed to be used with Ruby on Rails
A bbcode parser gem you can include in your rails app to parse bbcode-formatted strings to HTML or any other format you like.
The bbcode gem consists of 4 parts:
- The Tokenizer-class, which converts the bbcode-formatted string to a stream of tokens.
- The Parser-class, which attempts to pair bbcode tags to bbcode elements.
- The Handler-class, which converts bbcode elements any way you like.
- The Helpers-module, which adds a method to String, allowing you to convert bbcode-formatted strings with a registered handler.
Additionally, an HtmlHandler class is available. This class is a Handler
designed to convert bbcode elements to HTML more easily.
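To give a feel for the first of these stages, here is a minimal, self-contained sketch of how a bbcode string could be split into a stream of tokens. It is not the gem's actual Tokenizer-class: the names toy_tokenize and TAG_PATTERN and the token layout are made up for this illustration, and it only understands the simple [name=argument] form.

```ruby
# Toy tokenizer for illustration only; NOT the gem's Tokenizer-class.
# Tag names are limited to A-Z, 0-9, _ and - (an assumption of this sketch).
TAG_PATTERN = /\[(\/?)([A-Za-z0-9_-]*)(?:=([^\]]*))?\]/

def toy_tokenize(source)
  tokens = []
  position = 0

  source.scan(TAG_PATTERN) do
    match = Regexp.last_match

    # Emit any plain text found between the previous tag and this one.
    if match.begin(0) > position
      tokens << [:text, source[position...match.begin(0)]]
    end

    closing, name, argument = match[1], match[2], match[3]
    if closing == "/"
      # An empty name means an anonymous closing tag: [/]
      tokens << [:end, name.empty? ? nil : name.downcase.to_sym]
    elsif !name.empty?
      tokens << [:start, name.downcase.to_sym, argument]
    else
      # Something like "[]" is not a tag, so keep it as text.
      tokens << [:text, match[0]]
    end

    position = match.end(0)
  end

  # Emit trailing text after the last tag, if any.
  tokens << [:text, source[position..-1]] if position < source.length
  tokens
end

toy_tokenize("[b]Hello[/] [url=http://x.org]link[/url]").each { |token| p token }
```

A parser would then walk this token stream and pair the :start and :end tokens into elements.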
Installation:
Add the gem to the Gemfile of your project:
gem "th-bbcode", "~> 0.4.0"
Usage:
Create and register a handler. In this example, I'm creating an HtmlHandler and
registering it as :html.
require 'rubygems'
require 'bundler/setup'
require 'cgi' # provides CGI.escapeHTML, used by the :img handler below
require 'bbcode'
Bbcode::Base.register_handler :html, Bbcode::HtmlHandler.new(
:b => :strong,
:i => :em,
:url => [ :a, { :href => "%{0}" } ],
:txt => ->(element){ "#{element.content.source}" },
:img => ->(element){ %(<img src="#{CGI.escapeHTML(element.content.source)}">) },
:admin => ->(element, locals){ locals[:is_admin] ? element.content : "" },
:color => [ :span, { :style => "color: %{0};" } ]
)
That's it! You can now parse any string as bbcode and convert it to html with
the :html-handler like this:
"[b]Hello, bold world![/]".as_bbcode.to :html
# => <strong>Hello, bold world!</strong>
"[admin]Hello, admin![/]".as_bbcode.to :html, :is_admin => true
# => Hello, admin!
"[admin]Hello, admin![/]".as_bbcode.to :html, :is_admin => false
# =>
If you're using this gem in a rails project, I would recommend registering your handlers in an initializer.
See the spec/ folder for detailed usage examples.
Features:
- Parsing regular bbcode tags like [b] and [/b].
- Parsing anonymous closing bbcode tags like [/].
- Parsing bbcode tags with arguments like [a=foo, bar], [a foo=1 bar:2], [a=foo, bar bar:1 foo=2] and [a="foo" b='bar'].
- Parsing nested bbcode elements like [b]bold[i]and italic[/]only bold[/], which might result in <b>bold<i>and italic</i>only bold</b>.
- Parsing incorrectly nested bbcode elements like [b]bold[i]and italic[/b]only italic[/], which might result in <b>bold<i>and italic</i></b><i>only italic</i>.
- Passing variables to the handler and accessing them in the element handler callbacks.
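The handling of nested and incorrectly nested elements can be sketched with a simple stack of open tags. The code below is a toy illustration, not the gem's Parser-class: toy_render and the token layout are hypothetical, and the gem's own handling of element interrupts may well differ. When a closing tag names an element that is not the innermost one, the sketch closes the elements opened after it, closes the named element, and then reopens the interrupted ones.

```ruby
# Toy renderer for illustration only; NOT the gem's Parser-class.
# Tokens are [:start, name], [:end, name_or_nil] or [:text, string].
def toy_render(tokens)
  output = +""
  open_tags = [] # stack of currently open tag names

  tokens.each do |type, value|
    case type
    when :text
      output << value
    when :start
      open_tags.push(value)
      output << "<#{value}>"
    when :end
      # nil stands for the anonymous closing tag [/]: close the innermost element.
      target = value || open_tags.last
      next unless open_tags.include?(target) # stray closing tag: ignore it

      interrupted = []
      # Close everything that was opened after the target, innermost first.
      until open_tags.last == target
        name = open_tags.pop
        output << "</#{name}>"
        interrupted.unshift(name)
      end
      output << "</#{open_tags.pop}>"

      # Reopen the interrupted elements in their original order.
      interrupted.each do |name|
        open_tags.push(name)
        output << "<#{name}>"
      end
    end
  end

  # Close whatever is still open at the end of the input.
  output << "</#{open_tags.pop}>" until open_tags.empty?
  output
end

# Tokens for "[b]bold[i]and italic[/b]only italic[/]"
tokens = [
  [:start, :b], [:text, "bold"], [:start, :i], [:text, "and italic"],
  [:end, :b], [:text, "only italic"], [:end, nil]
]
puts toy_render(tokens)
# => <b>bold<i>and italic</i></b><i>only italic</i>
```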
Using WillScanString:
You might want URLs in a message to be converted to hyperlinks,
or you might want smileys in your bbcode message. This can be done by
defining a :"#text"-handler in your Handler or HtmlHandler.
I personally used WillScanString's StringScanner class to achieve this:
# Requires the will_scan_string gem
require "cgi" # provides CGI.escapeHTML, used by the e-mail replacement below
require "will_scan_string"
string_scanner = WillScanString::StringScanner.new
string_scanner.register_replacement "<", "&lt;"
string_scanner.register_replacement ">", "&gt;"
string_scanner.register_replacement "&", "&amp;"
string_scanner.register_replacement "\"", "&quot;"
string_scanner.register_replacement /(?:\r\n|\r|\n)/, "<br>"
string_scanner.register_replacement /[^\s@]+@[^\s@]+/, ->(email){ %(<a href="mailto:#{CGI.escapeHTML(email)}">#{CGI.escapeHTML(email)}</a>) }
# +handler+ is your Bbcode::HtmlHandler or Bbcode::Handler instance
handler.register_element_handler :"#text", ->(text){ string_scanner.replace(text) }
The above example converts newlines to <br>-tags, escapes HTML entities
and converts e-mail addresses to clickable mailto hyperlinks.
Note: By overwriting the :"#text"-handler in the HtmlHandler, HTML entities
are no longer replaced automatically, so you need to escape them in your
handler callback yourself. Failing to do so might expose your website to XSS
vulnerabilities.
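If you don't need WillScanString, a plain lambda can follow the same rule: escape first, add markup afterwards. The sketch below is only an illustration; SAFE_TEXT is a made-up name, and it relies on nothing but Ruby's standard CGI.escapeHTML and String#gsub.

```ruby
require "cgi"

# Hypothetical :"#text" callback: escape HTML first, then add markup.
SAFE_TEXT = lambda do |text|
  # Escaping must happen before any tags are inserted, otherwise the
  # <br> tags added below would be escaped as well.
  escaped = CGI.escapeHTML(text)
  # Convert all newline styles to <br> tags.
  escaped.gsub(/\r\n|\r|\n/, "<br>")
end

puts SAFE_TEXT.call("<script>alert(1)</script>\nhi & bye")
# => &lt;script&gt;alert(1)&lt;/script&gt;<br>hi &amp; bye
```

Such a lambda could then be registered as shown above, for example with handler.register_element_handler :"#text", SAFE_TEXT.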
Todo:
- An easier way to handle text around bbcode tags to, for example, add smileys and wrap URLs in hyperlinks. Currently, the only way to achieve this is by adding a :"#text"-handler to your handler and adding the functionality yourself. (Note: see the section on WillScanString above.)
- An easier way to include the content, source or content-source in the HtmlHandler-class.
- Review handleability of element interrupts.
- Review the regular expression matching bbcode tags to allow tags having names containing characters other than A-Z, 0-9, _ and -, possibly based on the currently registered tags.
- Add a CDATA-like feature for bbcode tags to allow tags to be ignored within certain elements. Useful for [code]-tags.
- Add a default handler with the most common bbcode tags.