Class: Prism::Source
- Inherits: Object
  - Object
  - Prism::Source
- Defined in: lib/prism/parse_result.rb, ext/prism/extension.c
Overview
This represents a source of Ruby code that has been parsed. It is used in conjunction with locations to allow them to resolve line numbers and source ranges.
Direct Known Subclasses
ASCIISource
Instance Attribute Summary
- #offsets ⇒ Object (readonly)
  The list of newline byte offsets in the source code.
- #source ⇒ Object (readonly)
  The source code that this source object represents.
- #start_line ⇒ Object (readonly)
  The line number where this source starts.
Class Method Summary
- .for(source, start_line = 1, offsets = []) ⇒ Object
  Create a new source object with the given source code.
Instance Method Summary
- #character_column(byte_offset) ⇒ Object
  Return the column number in characters for the given byte offset.
- #character_offset(byte_offset) ⇒ Object
  Return the character offset for the given byte offset.
- #code_units_cache(encoding) ⇒ Object
  Generate a cache that targets a specific encoding for calculating code unit offsets.
- #code_units_column(byte_offset, encoding) ⇒ Object
  Returns the column number in code units for the given encoding for the given byte offset.
- #code_units_offset(byte_offset, encoding) ⇒ Object
  Returns the offset from the start of the file for the given byte offset counting in code units for the given encoding.
- #column(byte_offset) ⇒ Object
  Return the column number for the given byte offset.
- #deep_freeze ⇒ Object
  Freeze this object and the objects it contains.
- #encoding ⇒ Object
  Returns the encoding of the source code, which is set by parameters to the parser or by the encoding magic comment.
- #initialize(source, start_line = 1, offsets = []) ⇒ Source (constructor)
  Create a new source object with the given source code.
- #line(byte_offset) ⇒ Object
  Binary search through the offsets to find the line number for the given byte offset.
- #line_end(byte_offset) ⇒ Object
  Returns the byte offset of the end of the line corresponding to the given byte offset.
- #line_start(byte_offset) ⇒ Object
  Return the byte offset of the start of the line corresponding to the given byte offset.
- #lines ⇒ Object
  Returns the lines of the source code as an array of strings.
- #replace_offsets(offsets) ⇒ Object
  Replace the value of offsets with the given value.
- #replace_start_line(start_line) ⇒ Object
  Replace the value of start_line with the given value.
- #slice(byte_offset, length) ⇒ Object
  Perform a byteslice on the source code using the given byte offset and byte length.
Constructor Details
#initialize(source, start_line = 1, offsets = []) ⇒ Source
Create a new source object with the given source code.
    # File 'lib/prism/parse_result.rb', line 45

    def initialize(source, start_line = 1, offsets = [])
      @source = source
      @start_line = start_line # set after parsing is done
      @offsets = offsets # set after parsing is done
    end
Instance Attribute Details
#offsets ⇒ Object (readonly)
The list of newline byte offsets in the source code.
    # File 'lib/prism/parse_result.rb', line 42

    def offsets
      @offsets
    end
#source ⇒ Object (readonly)
The source code that this source object represents.
    # File 'lib/prism/parse_result.rb', line 36

    def source
      @source
    end
#start_line ⇒ Object (readonly)
The line number where this source starts.
    # File 'lib/prism/parse_result.rb', line 39

    def start_line
      @start_line
    end
Class Method Details
.for(source, start_line = 1, offsets = []) ⇒ Object
Create a new source object with the given source code. This method should be used instead of `new`. It returns either a `Source` or, if the source code contains no multibyte characters, a specialized and more performant `ASCIISource`.
    # File 'lib/prism/parse_result.rb', line 12

    def self.for(source, start_line = 1, offsets = [])
      if source.ascii_only?
        ASCIISource.new(source, start_line, offsets)
      elsif source.encoding == Encoding::BINARY
        source.force_encoding(Encoding::UTF_8)

        if source.valid_encoding?
          new(source, start_line, offsets)
        else
          # This is an extremely niche use case where the file is marked as
          # binary, contains multi-byte characters, and those characters are not
          # valid UTF-8. In this case we'll mark it as binary and fall back to
          # treating everything as a single-byte character. This _may_ cause
          # problems when asking for code units, but it appears to be the
          # cleanest solution at the moment.
          source.force_encoding(Encoding::BINARY)
          ASCIISource.new(source, start_line, offsets)
        end
      else
        new(source, start_line, offsets)
      end
    end
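The branches of `.for` can be tried in isolation. The following is a minimal sketch, not Prism's code: `source_kind` is a hypothetical helper that returns a symbol naming the branch that would be taken instead of building `Source` or `ASCIISource` objects, so it runs without Prism loaded.

```ruby
# Hypothetical helper mirroring the branches of Source.for. It returns a
# symbol instead of constructing Prism objects.
def source_kind(source)
  if source.ascii_only?
    :ascii
  elsif source.encoding == Encoding::BINARY
    # Work on a copy so the caller's string keeps its encoding tag. The
    # method shown above calls force_encoding on the string it receives.
    candidate = source.dup.force_encoding(Encoding::UTF_8)
    candidate.valid_encoding? ? :multibyte : :ascii_fallback
  else
    :multibyte
  end
end

source_kind("foo = 1")      # => :ascii
source_kind("caf\u00E9")    # => :multibyte
source_kind("caf\u00E9".b)  # => :multibyte (binary tag, but valid UTF-8)
source_kind("\xFF\xFE".b)   # => :ascii_fallback (binary and not valid UTF-8)
```

Note that the original method re-tags the string it is given, so a caller who passes a binary string may see its encoding change afterwards.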
Instance Method Details
#character_column(byte_offset) ⇒ Object
Return the column number in characters for the given byte offset.
    # File 'lib/prism/parse_result.rb', line 107

    def character_column(byte_offset)
      character_offset(byte_offset) - character_offset(line_start(byte_offset))
    end
#character_offset(byte_offset) ⇒ Object
Return the character offset for the given byte offset.
    # File 'lib/prism/parse_result.rb', line 102

    def character_offset(byte_offset)
      (source.byteslice(0, byte_offset) or raise).length
    end
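To see how byte offsets and character offsets diverge, the same expression can be applied to a plain String. This is a sketch under the assumption that the source is UTF-8; the standalone `character_offset(source, byte_offset)` function is illustrative and takes the string as an explicit argument.

```ruby
# Same expression as the method body above, applied to a plain String:
# take the bytes before the offset and count the characters they form.
def character_offset(source, byte_offset)
  (source.byteslice(0, byte_offset) or raise).length
end

source = "\u00E9 = 1"         # "é" occupies 2 bytes in UTF-8

character_offset(source, 0)   # => 0
character_offset(source, 2)   # => 1 (just past "é")
character_offset(source, 6)   # => 5 (6 bytes, 5 characters)
```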
#code_units_cache(encoding) ⇒ Object
Generate a cache that targets a specific encoding for calculating code unit offsets.
    # File 'lib/prism/parse_result.rb', line 135

    def code_units_cache(encoding)
      CodeUnitsCache.new(source, encoding)
    end
#code_units_column(byte_offset, encoding) ⇒ Object
Returns the column number in code units for the given encoding for the given byte offset.
    # File 'lib/prism/parse_result.rb', line 141

    def code_units_column(byte_offset, encoding)
      code_units_offset(byte_offset, encoding) - code_units_offset(line_start(byte_offset), encoding)
    end
#code_units_offset(byte_offset, encoding) ⇒ Object
Returns the offset from the start of the file for the given byte offset counting in code units for the given encoding.
This method is tested with UTF-8, UTF-16, and UTF-32. If other encodings have a notion of code units that differs from the number of characters, it is not captured here.
We purposefully replace invalid and undefined characters with replacement characters in this conversion. This happens for two reasons. First, it’s possible that the given byte offset will not occur on a character boundary. Second, it’s possible that the source code will contain a character that has no equivalent in the given encoding.
    # File 'lib/prism/parse_result.rb', line 123

    def code_units_offset(byte_offset, encoding)
      byteslice = (source.byteslice(0, byte_offset) or raise).encode(encoding, invalid: :replace, undef: :replace)

      if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
        byteslice.bytesize / 2
      else
        byteslice.length
      end
    end
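A character outside the Basic Multilingual Plane shows why the UTF-16 branch exists. The sketch below reuses the logic of the method on a plain String; the standalone function signature and the sample source are assumptions made for illustration.

```ruby
# Same logic as the method above, with the source passed explicitly.
def code_units_offset(source, byte_offset, encoding)
  prefix = source.byteslice(0, byte_offset) || raise(ArgumentError, "offset out of range")
  converted = prefix.encode(encoding, invalid: :replace, undef: :replace)

  if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
    converted.bytesize / 2  # each UTF-16 code unit is 2 bytes
  else
    converted.length        # every other encoding counts characters
  end
end

source = "\u{1F600} = 1"    # the emoji is 4 bytes in UTF-8

code_units_offset(source, 4, Encoding::UTF_8)     # => 1
code_units_offset(source, 4, Encoding::UTF_16LE)  # => 2 (surrogate pair)
code_units_offset(source, 4, Encoding::UTF_32LE)  # => 1
```

Every call re-encodes the whole prefix, so repeated lookups on a large source are presumably the use case for `code_units_cache`.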
#column(byte_offset) ⇒ Object
Return the column number for the given byte offset.
    # File 'lib/prism/parse_result.rb', line 97

    def column(byte_offset)
      byte_offset - line_start(byte_offset)
    end
#deep_freeze ⇒ Object
Freeze this object and the objects it contains.
    # File 'lib/prism/parse_result.rb', line 146

    def deep_freeze
      source.freeze
      offsets.freeze
      freeze
    end
#encoding ⇒ Object
Returns the encoding of the source code, which is set by parameters to the parser or by the encoding magic comment.
    # File 'lib/prism/parse_result.rb', line 63

    def encoding
      source.encoding
    end
#line(byte_offset) ⇒ Object
Binary search through the offsets to find the line number for the given byte offset.
    # File 'lib/prism/parse_result.rb', line 80

    def line(byte_offset)
      start_line + find_line(byte_offset)
    end
#line_end(byte_offset) ⇒ Object
Returns the byte offset of the end of the line corresponding to the given byte offset.
    # File 'lib/prism/parse_result.rb', line 92

    def line_end(byte_offset)
      offsets[find_line(byte_offset) + 1] || source.bytesize
    end
#line_start(byte_offset) ⇒ Object
Return the byte offset of the start of the line corresponding to the given byte offset.
    # File 'lib/prism/parse_result.rb', line 86

    def line_start(byte_offset)
      offsets[find_line(byte_offset)]
    end
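`line`, `line_start`, and `line_end` all rely on a `find_line` helper that is not shown on this page. The sketch below is one plausible way to write it with `Array#bsearch_index`, not necessarily how Prism does. It assumes `offsets` holds the byte offset at which each line starts, beginning with 0, and the standalone functions take `offsets` and `source` as explicit arguments for illustration.

```ruby
# Hypothetical stand-in for the private find_line helper: the index of the
# last line start that is <= byte_offset.
def find_line(offsets, byte_offset)
  index = offsets.bsearch_index { |offset| offset > byte_offset } || offsets.length
  index - 1
end

def line(offsets, byte_offset, start_line = 1)
  start_line + find_line(offsets, byte_offset)
end

def line_start(offsets, byte_offset)
  offsets[find_line(offsets, byte_offset)]
end

def line_end(offsets, source, byte_offset)
  offsets[find_line(offsets, byte_offset) + 1] || source.bytesize
end

source  = "a = 1\nb = 2"
offsets = [0, 6]               # assumed line starts for the source above

line(offsets, 8)               # => 2
line_start(offsets, 8)         # => 6
line_end(offsets, source, 2)   # => 6
line_end(offsets, source, 8)   # => 11 (no later line, so source.bytesize)
```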
#lines ⇒ Object
Returns the lines of the source code as an array of strings.
    # File 'lib/prism/parse_result.rb', line 68

    def lines
      source.lines
    end
#replace_offsets(offsets) ⇒ Object
Replace the value of offsets with the given value.
    # File 'lib/prism/parse_result.rb', line 57

    def replace_offsets(offsets)
      @offsets.replace(offsets)
    end
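Because `replace_offsets` calls `Array#replace`, it mutates the existing array instead of assigning a new one. The sketch below illustrates two consequences using `MiniSource`, a hypothetical stand-in class that copies only the relevant method bodies from this page: references obtained earlier see the new contents, and the call raises once `deep_freeze` has frozen the array.

```ruby
# Hypothetical stand-in for Prism::Source with only the pieces needed here.
class MiniSource
  attr_reader :offsets

  def initialize(offsets = [])
    @offsets = offsets
  end

  def replace_offsets(offsets)
    @offsets.replace(offsets)  # in-place mutation, same Array object
  end

  def deep_freeze
    offsets.freeze
    freeze
  end
end

source = MiniSource.new
held = source.offsets            # reference taken before the update

source.replace_offsets([0, 6, 12])
held                             # => [0, 6, 12]
held.equal?(source.offsets)      # => true

source.deep_freeze
begin
  source.replace_offsets([0])
rescue FrozenError
  # A frozen Array rejects replace, so offsets must be final before freezing.
end
```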
#replace_start_line(start_line) ⇒ Object
Replace the value of start_line with the given value.
    # File 'lib/prism/parse_result.rb', line 52

    def replace_start_line(start_line)
      @start_line = start_line
    end
#slice(byte_offset, length) ⇒ Object
Perform a byteslice on the source code using the given byte offset and byte length.
    # File 'lib/prism/parse_result.rb', line 74

    def slice(byte_offset, length)
      source.byteslice(byte_offset, length) or raise
    end
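The arguments to `slice` are byte positions, which differ from character indexes as soon as the source contains multibyte characters. A small sketch on a plain String, with a standalone `slice(source, byte_offset, length)` function assumed for illustration:

```ruby
# Same expression as the method body above, with the source passed explicitly.
def slice(source, byte_offset, length)
  source.byteslice(byte_offset, length) or raise
end

source = "\u00E9 = 1"   # "é" is 2 bytes, so later bytes are shifted by one

slice(source, 3, 1)     # => "="
source[3, 1]            # => " " (character index 3, not byte 3)

begin
  slice(source, 7, 1)   # byteslice returns nil past the end, so this raises
rescue RuntimeError
  # A bare `raise` with no active exception raises RuntimeError.
end
```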