Class: String
Overview
This class extends and modifies the standard String class. The parser does a lot of UTF-8 character processing. Ruby 1.8 does not have good enough UTF-8 support, and Ruby 1.9 only handles UTF-8 characters as Strings, which is very inefficient compared to representing them as Integer objects. Some of these hacks can be removed once only Ruby 1.9 is supported.
Instance Method Summary
- #<<(obj) ⇒ Object
  Replacement for the existing << operator that also works for characters above Integer 255 (UTF-8 characters).
- #forceUTF8Encoding ⇒ Object
  Ensure the String is really UTF-8 encoded and newlines are only \n.
- #ljust(len, pad = ' ') ⇒ Object
- #old_double_left_angle ⇒ Object
- #old_reverse ⇒ Object
- #reverse ⇒ Object
  UTF-8 aware version of reverse that replaces the built-in one.
- #to_base64 ⇒ Object
- #to_quoted_printable ⇒ Object
- #unix2dos ⇒ Object
Instance Method Details
#<<(obj) ⇒ Object
Replacement for the existing << operator that also works for characters above Integer 255 (UTF-8 characters).
    # File 'lib/taskjuggler/UTF8String.rb', line 62

    def <<(obj)
      if obj.is_a?(String) || (obj < 256)
        # In this case we can use the built-in concat.
        concat(obj)
      else
        # UTF-8 characters have a maximum length of 4 bytes and no byte is 0.
        mask = 0xFF000000
        pos = 3
        while pos >= 0
          # Use the built-in concat operator for each byte.
          concat((obj & mask) >> (8 * pos)) if (obj & mask) != 0
          # Move mask and position to the next byte.
          mask = mask >> 8
          pos -= 1
        end
      end
    end
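The mask-and-shift loop above treats its Integer argument as up to four UTF-8 bytes packed into one number. The following is a minimal sketch of that unpacking step on its own, without patching String. The helper name `packed_utf8_bytes` is hypothetical and not part of TaskJuggler; the sketch also assumes Ruby 1.9 or later for `force_encoding`.

```ruby
# Hypothetical helper (not part of TaskJuggler): split an Integer that packs
# up to four UTF-8 bytes into its non-zero bytes, most significant byte first.
def packed_utf8_bytes(packed)
  bytes = []
  3.downto(0) do |pos|
    byte = (packed >> (8 * pos)) & 0xFF
    # UTF-8 never contains a zero byte inside a multi-byte character, so a
    # zero here only means "unused leading position".
    bytes << byte unless byte.zero?
  end
  bytes
end

# U+00E9 (e with acute accent) is encoded in UTF-8 as the bytes 0xC3 0xA9.
packed = 0xC3A9
bytes = packed_utf8_bytes(packed)

# Reassemble the raw bytes and label them as UTF-8.
text = bytes.pack('C*').force_encoding('UTF-8')
```

Here `bytes` holds the two bytes of the character and `text` is the one-character String they form.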
#forceUTF8Encoding ⇒ Object
Ensure the String is really UTF-8 encoded and newlines are only \n. If that’s not possible, an Encoding::UndefinedConversionError is raised.
    # File 'lib/taskjuggler/UTF8String.rb', line 122

    def forceUTF8Encoding
      if RUBY_VERSION < '1.9.0'
        # Ruby 1.8 really only supports 7 bit ASCII well. Only do the line-end
        # clean-up.
        gsub(/\r\n/, "\n")
      else
        begin
          # Ensure that the text has LF line ends and is UTF-8 encoded.
          encode('UTF-8', :universal_newline => true)
        rescue
          # The encoding of the String is broken. Find the first broken line and
          # report it.
          lineCtr = 1
          each_line do |line|
            begin
              line.encode('UTF-8')
            rescue
              line = line.encode('UTF-8', :invalid => :replace,
                                 :undef => :replace, :replace => '<?>')
              raise Encoding::UndefinedConversionError,
                    "UTF-8 encoding error in line #{lineCtr}: #{line}"
            end
            lineCtr += 1
          end
        end
      end
    end
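The two paths of the method above can be illustrated in isolation. This is only a sketch under stated assumptions: it uses the standard `String#encode` options on Ruby 1.9 or later, and the helper `first_unconvertible_line` is a hypothetical name that returns the offending line instead of raising, so the result is easy to inspect.

```ruby
# Hypothetical helper (not part of TaskJuggler): return [line_number,
# sanitized_line] for the first line that cannot be converted to UTF-8, or
# nil if every line converts cleanly.
def first_unconvertible_line(text)
  text.each_line.with_index(1) do |line, number|
    begin
      line.encode('UTF-8')
    rescue EncodingError
      # Replace every byte that cannot be converted with a visible marker.
      sanitized = line.encode('UTF-8', invalid: :replace, undef: :replace,
                                       replace: '<?>')
      return [number, sanitized]
    end
  end
  nil
end

# Success path: ISO-8859-1 text with CRLF line ends becomes UTF-8 with LF.
latin1 = "caf\xE9\r\nbar\r\n".b.force_encoding('ISO-8859-1')
clean = latin1.encode('UTF-8', universal_newline: true)

# Failure path: a binary String whose second line has a byte with no UTF-8
# equivalent.
binary = "ok\nbad \xFF byte\n".b
report = first_unconvertible_line(binary)
```

In this sketch `clean` contains only LF line ends, and `report` identifies line 2 with the unconvertible byte replaced by `<?>`, mirroring the text that the original method puts into its error message.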
#ljust(len, pad = ' ') ⇒ Object
    # File 'lib/taskjuggler/UTF8String.rb', line 89

    def ljust(len, pad = ' ')
      return self + pad * (len - length_utf8) if length_utf8 < len
      self
    end
#old_double_left_angle ⇒ Object
    # File 'lib/taskjuggler/UTF8String.rb', line 58

    alias old_double_left_angle <<
#old_reverse ⇒ Object
    # File 'lib/taskjuggler/UTF8String.rb', line 94

    alias old_reverse reverse
#reverse ⇒ Object
UTF-8 aware version of reverse that replaces the built-in one.
    # File 'lib/taskjuggler/UTF8String.rb', line 97

    def reverse
      a = []
      each_utf8_char { |c| a << c }
      a.reverse.join
    end
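To see why a UTF-8 aware reverse is needed, it helps to compare reversing characters with reversing raw bytes. The sketch below assumes Ruby 1.9 or later and uses the built-in `each_char` as a stand-in for `each_utf8_char`, which is defined elsewhere in the file and not shown in this section.

```ruby
text = "a\u00E9b" # "a", U+00E9 (two bytes in UTF-8: 0xC3 0xA9), "b"

# Character-wise reversal, the approach taken by the method above: collect
# whole characters, reverse the list, and join it back together.
char_reversed = text.each_char.to_a.reverse.join

# Byte-wise reversal, which is what a reverse that is unaware of UTF-8
# effectively does: the two bytes of U+00E9 end up in the wrong order.
byte_reversed = text.bytes.reverse.pack('C*').force_encoding('UTF-8')

still_valid = byte_reversed.valid_encoding?
```

The character-wise result keeps the accented character intact, while the byte-wise result is no longer valid UTF-8, so `still_valid` is false.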
#to_base64 ⇒ Object
    # File 'lib/taskjuggler/UTF8String.rb', line 112

    def to_base64
      Base64.encode64(self)
    end
#to_quoted_printable ⇒ Object
    # File 'lib/taskjuggler/UTF8String.rb', line 108

    def to_quoted_printable
      [self].pack('M').gsub(/\n/, "\r\n")
    end
#unix2dos ⇒ Object
    # File 'lib/taskjuggler/UTF8String.rb', line 116

    def unix2dos
      gsub(/\n/, "\r\n")
    end
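The three undocumented helpers above (to_base64, to_quoted_printable, unix2dos) are thin wrappers around standard Ruby calls. The sketch below applies the same expressions to plain Strings so their effect can be inspected without loading TaskJuggler. One assumption to note: it uses `pack('m')`, which is the call that `Base64.encode64` wraps, to avoid requiring the base64 library.

```ruby
# Base64, as in to_base64. pack('m') appends a trailing newline.
base64 = ['hello'].pack('m')

# Quoted-printable, as in to_quoted_printable. pack('M') escapes non-ASCII
# bytes as =XX and ends the output with a soft line break ("=\n") when the
# input does not end in a newline. The gsub then turns LF into CRLF.
quoted = ["caf\u00E9"].pack('M').gsub(/\n/, "\r\n")

# LF to CRLF conversion, as in unix2dos.
dos = "a\nb\n".gsub(/\n/, "\r\n")

# The same substitution applied to text that already uses CRLF doubles the
# carriage return, so the input is expected to have LF line ends only.
doubled = "a\r\n".gsub(/\n/, "\r\n")
```

The last line shows a property of the plain `gsub` approach: it does not check for an existing carriage return, so it is best applied to text that has already been normalized to LF line ends.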