Class: JsDuck::Css::Lexer
- Inherits: Object (Object > JsDuck::Css::Lexer)
- Defined in: lib/jsduck/css/lexer.rb
Overview
Tokenizes CSS or SCSS code into lexical tokens.
Each token has a type and a value. The types, with an example value for each, are:
- :number – “25.8”
- :percentage – “25%”
- :dimension – “2em”
- :string – ‘“Hello world”’
- :ident – “foo-bar”
- :at_keyword – “@mixin”
- :hash – “#00FF66”
- :delim – “{”
- :doc_comment – “/** My comment */”
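Note that doc-comments are recognized as tokens, while ordinary comments are skipped, just like whitespace. The #next_token source listed further down also produces two types that are not in the list above: :var (a “$” followed by an identifier) and :operator (a lone “/”).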
Constant Summary
- IDENT = /-?[_a-z][_a-z0-9-]*/i
  Simplified token syntax based on: www.w3.org/TR/CSS21/syndata.html
- NAME = /[_a-z0-9-]+/i
- NUM = /[0-9]*\.[0-9]+|[0-9]+/
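To make the difference between the three patterns concrete, here is a small sketch that copies them verbatim and tests a few strings against each. The helper `whole_match?` is hypothetical and exists only for this illustration; the lexer itself applies the patterns through a StringScanner rather than anchoring them.

```ruby
# Patterns copied from the constant summary above.
IDENT = /-?[_a-z][_a-z0-9-]*/i
NAME  = /[_a-z0-9-]+/i
NUM   = /[0-9]*\.[0-9]+|[0-9]+/

# Hypothetical helper: true when the pattern matches the entire string.
# The original flags (e.g. case-insensitivity) are carried over.
def whole_match?(re, str)
  Regexp.new("\\A(?:#{re.source})\\z", re.options).match?(str)
end

whole_match?(IDENT, "foo-bar")   # true
whole_match?(IDENT, "-moz-box")  # true: one leading dash is allowed
whole_match?(IDENT, "00FF66")    # false: an identifier cannot start with a digit
whole_match?(NAME,  "00FF66")    # true: NAME has no restriction on the first character
whole_match?(NUM,   ".5")        # true: the integer part is optional
whole_match?(NUM,   "25.")       # false: a dot must be followed by digits
```

The IDENT/NAME split appears to be why hash tokens are matched with NAME: a color such as “#00FF66” begins with a digit after the “#”, which IDENT would reject.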
Instance Method Summary
- #buffer_tokens(n) ⇒ Object
  Ensures the next n tokens are read into the buffer.
- #empty? ⇒ Boolean
  True when there are no more tokens.
- #initialize(input) ⇒ Lexer (constructor)
  Initializes the lexer with an input string.
- #look(*tokens) ⇒ Object
  Tests whether the given pattern matches the tokens that follow the current position.
- #maybe(token_type, before_re, after_re) ⇒ Object
  Returns a token of the given type when both regexes match.
- #next(full = false) ⇒ Object
  Returns the value of the next token and advances the token cursor past it.
- #next_token ⇒ Object
  Parses the next token out of the input stream.
- #skip_white ⇒ Object
Constructor Details
#initialize(input) ⇒ Lexer
Initializes lexer with input string.
    # File 'lib/jsduck/css/lexer.rb', line 26
    def initialize(input)
      @input = StringScanner.new(input)
      @buffer = []
    end
Instance Method Details
#buffer_tokens(n) ⇒ Object
Ensures the next n tokens are read into the buffer.
When buffering finishes, the scan pointer is restored to its initial position. Only the #next method advances the scan pointer in a way that is visible outside this class.
    # File 'lib/jsduck/css/lexer.rb', line 84
    def buffer_tokens(n)
      prev_pos = @input.pos
      @input.pos = @buffer.last[:pos] if @buffer.last
      (n - @buffer.length).times do
        @previous_token = tok = next_token
        if tok
          # remember scanpointer position after each token
          tok[:pos] = @input.pos
          @buffer << tok
        end
      end
      @input.pos = prev_pos
    end
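The save-and-restore of the scan pointer is what makes the lookahead invisible to callers. The following sketch isolates that idea with a plain StringScanner. The helper `peek_words` is hypothetical and reads whitespace-separated words rather than CSS tokens; it is not part of the lexer.

```ruby
require 'strscan'

# Hypothetical helper: read up to n whitespace-separated words ahead,
# then put the scan pointer back where it started.
def peek_words(scanner, n)
  saved_pos = scanner.pos
  words = []
  n.times do
    scanner.skip(/\s+/)
    word = scanner.scan(/\S+/)
    words << word if word
  end
  scanner.pos = saved_pos  # restore, so the caller sees no movement
  words
end

scanner = StringScanner.new("a { color: red }")
ahead = peek_words(scanner, 3)  # ["a", "{", "color:"]
scanner.pos                     # still 0
```

The lexer additionally stores the position after each buffered token (tok[:pos]), which lets #next later jump the scan pointer forward by exactly one token.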
#empty? ⇒ Boolean
True when no more tokens.
    # File 'lib/jsduck/css/lexer.rb', line 74
    def empty?
      buffer_tokens(1)
      return !@buffer.first
    end
#look(*tokens) ⇒ Object
Tests whether the given pattern matches the tokens that follow the current position.
Takes a list of strings and symbols. Symbols are compared against the token type, strings against the token value. For example:
look(:ident, ":", :dimension)
    # File 'lib/jsduck/css/lexer.rb', line 39
    def look(*tokens)
      buffer_tokens(tokens.length)
      i = 0
      tokens.all? do |t|
        tok = @buffer[i]
        i += 1
        if !tok
          false
        elsif t.instance_of?(Symbol)
          tok[:type] == t
        else
          tok[:value] == t
        end
      end
    end
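The symbol-versus-string comparison can be tried without a lexer instance. The sketch below applies the same rule to a hand-written token list; the function `matches?` and the sample tokens are hypothetical stand-ins for the lexer's internal buffer.

```ruby
# Hypothetical stand-in for the lexer's token buffer.
tokens = [
  { :type => :ident,     :value => "width" },
  { :type => :delim,     :value => ":" },
  { :type => :dimension, :value => "2em" },
]

# Symbols are compared to the token type, strings to the token value.
# A pattern that reaches past the end of the buffer does not match.
def matches?(buffer, *pattern)
  pattern.each_with_index.all? do |expected, i|
    tok = buffer[i]
    next false unless tok
    expected.is_a?(Symbol) ? tok[:type] == expected : tok[:value] == expected
  end
end

matches?(tokens, :ident, ":", :dimension)          # true
matches?(tokens, :ident, ";")                      # false: value differs
matches?(tokens, :ident, ":", :dimension, :delim)  # false: no fourth token
```

Mixing the two kinds of pattern element lets a parser test for a specific punctuation mark while accepting any identifier or dimension around it.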
#maybe(token_type, before_re, after_re) ⇒ Object
Returns a token of the given type when both regexes match. Otherwise returns a :delim token whose value is the first regex's match. The first regex must always match.
    # File 'lib/jsduck/css/lexer.rb', line 175
    def maybe(token_type, before_re, after_re)
      before = @input.scan(before_re)
      if @input.check(after_re)
        return {
          :type => token_type,
          :value => before + @input.scan(after_re)
        }
      else
        return {
          :type => :delim,
          :value => before
        }
      end
    end
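To see both outcomes, the sketch below restates the method as a free function that takes the scanner as an explicit argument. That extra parameter and the constant name `IDENT_PATTERN` are assumptions made so the example runs on its own; the logic follows the listing above.

```ruby
require 'strscan'

# Same pattern as the IDENT constant, renamed for this standalone sketch.
IDENT_PATTERN = /-?[_a-z][_a-z0-9-]*/i

# Standalone variant of #maybe: the scanner is passed in explicitly.
def maybe(input, token_type, before_re, after_re)
  before = input.scan(before_re)
  if input.check(after_re)
    { :type => token_type, :value => before + input.scan(after_re) }
  else
    { :type => :delim, :value => before }
  end
end

keyword = maybe(StringScanner.new("@mixin foo"), :at_keyword, /@/, IDENT_PATTERN)
# keyword has type :at_keyword and value "@mixin"

lone_at = maybe(StringScanner.new("@ foo"), :at_keyword, /@/, IDENT_PATTERN)
# lone_at has type :delim and value "@", because no identifier follows directly
```

If the first regex fails to match, `before` is nil and the function breaks in either branch, which is why the first regex must always match.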
#next(full = false) ⇒ Object
Returns the value of the next token and advances the token cursor past it.
When full=true, returns the full token as a hash, like so:
{:type => :ident, :value => "foo"}
For doc-comments the full token also contains the field :linenr, pointing to the line where the doc-comment began.
    # File 'lib/jsduck/css/lexer.rb', line 65
    def next(full=false)
      buffer_tokens(1)
      tok = @buffer.shift
      # advance the scanpointer to the position after this token
      @input.pos = tok[:pos]
      full ? tok : tok[:value]
    end
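As listed, #next reads tok[:pos] without checking whether a token was actually shifted off the buffer, so a caller draining the lexer would presumably guard each call with #empty?. The sketch below shows that loop against a hypothetical stand-in class, `TokenQueue`, which exposes the same two methods over a fixed token list; it is not the real lexer.

```ruby
# Hypothetical stand-in with the same empty?/next pair as the lexer.
class TokenQueue
  def initialize(tokens)
    @buffer = tokens.dup
  end

  def empty?
    @buffer.empty?
  end

  # Returns the value by default, or the whole token hash when full is true.
  def next(full = false)
    tok = @buffer.shift
    full ? tok : tok[:value]
  end
end

queue = TokenQueue.new([
  { :type => :ident, :value => "foo" },
  { :type => :delim, :value => "{" },
])

first = queue.next(true)                  # the full hash for "foo"
values = []
values << queue.next until queue.empty?   # collects the remaining values
```

After the loop, `values` holds only "{", since the first token was already consumed by the call with full set to true.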
#next_token ⇒ Object
Parses out next token from input stream.
    # File 'lib/jsduck/css/lexer.rb', line 99
    def next_token
      while !@input.eos? do
        skip_white
        if @input.check(IDENT)
          return {
            :type => :ident,
            :value => @input.scan(IDENT)
          }
        elsif @input.check(/'/)
          return {
            :type => :string,
            :value => @input.scan(/'([^'\\]|\\.)*('|\z)/m)
          }
        elsif @input.check(/"/)
          return {
            :type => :string,
            :value => @input.scan(/"([^"\\]|\\.)*("|\z)/m)
          }
        elsif @input.check(/\//)
          # Several things begin with slash:
          # - comments, regexes, division-operators
          if @input.check(/\/\*\*[^\/]/)
            return {
              :type => :doc_comment,
              # Calculate current line number, starting with 1
              :linenr => @input.string[0...@input.pos].count("\n") + 1,
              :value => @input.scan_until(/\*\/|\z/).sub(/\A\/\*\*/, "").sub(/\*\/\z/, "")
            }
          elsif @input.check(/\/\*/)
            # skip multiline comment
            @input.scan_until(/\*\/|\z/)
          elsif @input.check(/\/\//)
            # skip line comment
            @input.scan_until(/\n|\z/)
          else
            return {
              :type => :operator,
              :value => @input.scan(/\//)
            }
          end
        elsif @input.check(NUM)
          nr = @input.scan(NUM)
          if @input.check(/%/)
            return {
              :type => :percentage,
              :value => nr + @input.scan(/%/)
            }
          elsif @input.check(IDENT)
            return {
              :type => :dimension,
              :value => nr + @input.scan(IDENT)
            }
          else
            return {
              :type => :number,
              :value => nr
            }
          end
        elsif @input.check(/@/)
          return maybe(:at_keyword, /@/, IDENT)
        elsif @input.check(/#/)
          return maybe(:hash, /#/, NAME)
        elsif @input.check(/\$/)
          return maybe(:var, /\$/, IDENT)
        elsif @input.check(/./)
          return {
            :type => :delim,
            :value => @input.scan(/./)
          }
        end
      end
    end
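The doc-comment branch does two things at once: it derives a 1-based line number from the text before the scan pointer, and it strips the comment delimiters from the value. The sketch below reproduces those two expressions on a sample input. The sample string and the way the scanner is positioned at the comment start are assumptions for illustration.

```ruby
require 'strscan'

input = StringScanner.new("a {}\n\n/** My comment */\nb {}")

# Assumption for the sketch: jump straight to where the doc-comment starts.
# (The sample is pure ASCII, so character index and byte position agree.)
input.pos = input.string.index("/**")

# Count the newlines before the scan pointer; the first line is line 1.
linenr = input.string[0...input.pos].count("\n") + 1

# Consume up to the closing delimiter (or end of input), then strip both delimiters.
value = input.scan_until(/\*\/|\z/).sub(/\A\/\*\*/, "").sub(/\*\/\z/, "")

linenr  # 3
value   # " My comment "
```

Going by the listing, the stored value therefore excludes the “/**” and “*/” delimiters but keeps the surrounding whitespace.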
#skip_white ⇒ Object
    # File 'lib/jsduck/css/lexer.rb', line 190
    def skip_white
      @input.scan(/\s+/)
    end