Class: JsDuck::Css::Lexer

Inherits:
Object
Defined in:
lib/jsduck/css/lexer.rb

Overview

Tokenizes CSS or SCSS code into lexical tokens.

Each token has a type and value. Types and possible values for them are as follows:

  • :number – “25.8”

  • :percentage – “25%”

  • :dimension – “2em”

  • :string – ‘“Hello world”’

  • :ident – “foo-bar”

  • :at_keyword – “@mixin”

  • :hash – “#00FF66”

  • :delim – “{”

  • :doc_comment – “/** My comment */” (stored without the /** and */ delimiters)

  • :var – “$foo”

  • :operator – “/”

Note that doc-comments are recognized as tokens, while normal comments are ignored just like whitespace.
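The three numeric types in the list above differ only in what follows the digits. The following is a minimal sketch of that distinction, written as a standalone helper rather than as part of the class: `numeric_token` is a hypothetical name, and the two regexes are copied from the constants of this class.

```ruby
require 'strscan'

# Copied from the constants of JsDuck::Css::Lexer.
NUM   = /[0-9]*\.[0-9]+|[0-9]+/
IDENT = /-?[_a-z][_a-z0-9-]*/i

# Hypothetical helper (not part of the class): classifies the numeric
# token at the start of `text`, or returns nil when there is none.
def numeric_token(text)
  input = StringScanner.new(text)
  nr = input.scan(NUM)
  return nil unless nr

  if input.check(/%/)
    # A number directly followed by "%" is a percentage.
    { :type => :percentage, :value => nr + input.scan(/%/) }
  elsif input.check(IDENT)
    # A number directly followed by an identifier (the unit) is a dimension.
    { :type => :dimension, :value => nr + input.scan(IDENT) }
  else
    { :type => :number, :value => nr }
  end
end

numeric_token("25%")    # type :percentage, value "25%"
numeric_token("2em")    # type :dimension,  value "2em"
numeric_token("25.8;")  # type :number,     value "25.8"
```
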

Constant Summary

Simplified token syntax based on: www.w3.org/TR/CSS21/syndata.html

IDENT =
/-?[_a-z][_a-z0-9-]*/i
NAME =
/[_a-z0-9-]+/i
NUM =
/[0-9]*\.[0-9]+|[0-9]+/
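To see how the three patterns differ, the sketch below applies each one at the start of a few sample strings using `StringScanner`, which is what the lexer itself scans with. `match_at_start` is a hypothetical helper introduced only for this illustration.

```ruby
require 'strscan'

# Copied from the constants listed above.
IDENT = /-?[_a-z][_a-z0-9-]*/i
NAME  = /[_a-z0-9-]+/i
NUM   = /[0-9]*\.[0-9]+|[0-9]+/

# Hypothetical helper: returns the text matched at the very start of
# `text`, or nil when the pattern does not match there.
def match_at_start(regex, text)
  StringScanner.new(text).scan(regex)
end

match_at_start(IDENT, "foo-bar: 1px")  # "foo-bar"
match_at_start(IDENT, "00FF66")        # nil: an identifier cannot start with a digit
match_at_start(NAME,  "00FF66")        # "00FF66": a name can, which is why :hash uses NAME
match_at_start(NUM,   ".5em")          # ".5": the integer part is optional
match_at_start(NUM,   "25.8")          # "25.8"
```
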

Constructor Details

#initialize(input) ⇒ Lexer

Initializes the lexer with an input string.



# File 'lib/jsduck/css/lexer.rb', line 26

def initialize(input)
  @input = StringScanner.new(input)
  @buffer = []
end

Instance Method Details

#buffer_tokens(n) ⇒ Object

Ensures that the next n tokens are read into the buffer.

When buffering finishes, the scanpointer is restored to its initial position. Only the #next method advances the scanpointer in a way that is visible outside this class.



# File 'lib/jsduck/css/lexer.rb', line 84

def buffer_tokens(n)
  prev_pos = @input.pos
  @input.pos = @buffer.last[:pos] if @buffer.last
  (n - @buffer.length).times do
    @previous_token = tok = next_token
    if tok
      # remember scanpointer position after each token
      tok[:pos] = @input.pos
      @buffer << tok
    end
  end
  @input.pos = prev_pos
end
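The save-and-restore idea behind this method can be shown in isolation. The sketch below is a simplified stand-in, not the method itself: `peek_words` is a hypothetical helper that reads ahead over plain lowercase words instead of CSS tokens, and it returns the look-ahead instead of keeping it in an instance buffer.

```ruby
require 'strscan'

# Hypothetical helper: reads up to `n` words ahead, remembers the
# scanpointer position after each one, then restores the scanpointer
# so the read-ahead is invisible to the caller.
def peek_words(input, n)
  start = input.pos
  words = []
  n.times do
    input.scan(/\s+/)
    word = input.scan(/[a-z]+/)
    break unless word
    words << { :value => word, :pos => input.pos }
  end
  input.pos = start
  words
end

input = StringScanner.new("color red blue")
ahead = peek_words(input, 2)
input.pos                     # still 0 after peeking
# Consuming the first word means jumping to its remembered position,
# which mirrors what #next does with tok[:pos].
input.pos = ahead.first[:pos]
input.pos                     # 5, just past "color"
```
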

#empty? ⇒ Boolean

True when there are no more tokens.

Returns:

  • (Boolean)


# File 'lib/jsduck/css/lexer.rb', line 74

def empty?
  buffer_tokens(1)
  return !@buffer.first
end

#look(*tokens) ⇒ Object

Tests whether the given pattern matches the tokens that follow the current position.

Takes a list of strings and symbols. Symbols are compared to the token type, strings to the token value. For example:

look(:ident, ":", :dimension)


# File 'lib/jsduck/css/lexer.rb', line 39

def look(*tokens)
  buffer_tokens(tokens.length)
  i = 0
  tokens.all? do |t|
    tok = @buffer[i]
    i += 1
    if !tok
      false
    elsif t.instance_of?(Symbol)
      tok[:type] == t
    else
      tok[:value] == t
    end
  end
end
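The symbol-versus-string comparison can be tried without a lexer instance. In this sketch the `buffer` array is a hand-written stand-in for the tokens the lexer would have buffered for `width: 2em`, and `matches?` is a hypothetical free function that takes the buffer as an explicit argument.

```ruby
# Hand-written stand-in for the lexer's token buffer (an assumption
# made for this example, not output captured from the lexer).
buffer = [
  { :type => :ident,     :value => "width" },
  { :type => :delim,     :value => ":" },
  { :type => :dimension, :value => "2em" },
]

# Hypothetical helper: same comparison rules as #look, but on an
# explicit buffer. Symbols match the token type, strings the value.
def matches?(buffer, *pattern)
  pattern.each_with_index.all? do |expected, i|
    tok = buffer[i]
    if !tok
      false                       # pattern is longer than the buffer
    elsif expected.instance_of?(Symbol)
      tok[:type] == expected
    else
      tok[:value] == expected
    end
  end
end

matches?(buffer, :ident, ":", :dimension)       # true
matches?(buffer, :ident, ";")                   # false, the second value is ":"
matches?(buffer, :ident, ":", :dimension, ";")  # false, no fourth token
```
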

#maybe(token_type, before_re, after_re) ⇒ Object

Returns a token of the given type when both regexes match. Otherwise returns a :delim token whose value is the match of the first regex. The first regex must always match.



# File 'lib/jsduck/css/lexer.rb', line 175

def maybe(token_type, before_re, after_re)
  before = @input.scan(before_re)
  if @input.check(after_re)
    return {
      :type => token_type,
      :value => before + @input.scan(after_re)
    }
  else
    return {
      :type => :delim,
      :value => before
    }
  end
end
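The fallback to :delim is easiest to see with inputs where the second regex fails. The sketch below repeats the same logic as a standalone function that receives the scanner as an explicit first argument (an adaptation made for this example; the real method reads from @input).

```ruby
require 'strscan'

# Copied from the constants of JsDuck::Css::Lexer.
IDENT = /-?[_a-z][_a-z0-9-]*/i
NAME  = /[_a-z0-9-]+/i

# Standalone adaptation of #maybe with an explicit scanner argument.
def maybe(input, token_type, before_re, after_re)
  before = input.scan(before_re)
  if input.check(after_re)
    { :type => token_type, :value => before + input.scan(after_re) }
  else
    { :type => :delim, :value => before }
  end
end

maybe(StringScanner.new("@mixin foo"), :at_keyword, /@/, IDENT)
# type :at_keyword, value "@mixin"

maybe(StringScanner.new("#00FF66"), :hash, /#/, NAME)
# type :hash, value "#00FF66"

# In SCSS interpolation the "#" is followed by "{", which is not a NAME,
# so the "#" alone comes back as a delimiter.
maybe(StringScanner.new('#{$x}'), :hash, /#/, NAME)
# type :delim, value "#"
```
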

#next(full = false) ⇒ Object

Returns the value of the next token and advances the token cursor past it.

When full=true, returns the full token as a hash, like so:

{:type => :ident, :value => "foo"}

For doc-comments the full token also contains the field :linenr, pointing to the line where the doc-comment began.



# File 'lib/jsduck/css/lexer.rb', line 65

def next(full=false)
  buffer_tokens(1)
  tok = @buffer.shift
  # advance the scanpointer to the position after this token
  @input.pos = tok[:pos]
  full ? tok : tok[:value]
end

#next_token ⇒ Object

Parses the next token out of the input stream.



# File 'lib/jsduck/css/lexer.rb', line 99

def next_token
  while !@input.eos? do
    skip_white
    if @input.check(IDENT)
      return {
        :type => :ident,
        :value => @input.scan(IDENT)
      }
    elsif @input.check(/'/)
      return {
        :type => :string,
        :value => @input.scan(/'([^'\\]|\\.)*('|\z)/m)
      }
    elsif @input.check(/"/)
      return {
        :type => :string,
        :value => @input.scan(/"([^"\\]|\\.)*("|\z)/m)
      }
    elsif @input.check(/\//)
      # Several things begin with a slash:
      # - comments, division operators
      if @input.check(/\/\*\*[^\/]/)
        return {
          :type => :doc_comment,
          # Calculate current line number, starting with 1
          :linenr => @input.string[0...@input.pos].count("\n") + 1,
          :value => @input.scan_until(/\*\/|\z/).sub(/\A\/\*\*/, "").sub(/\*\/\z/, "")
        }
      elsif @input.check(/\/\*/)
        # skip multiline comment
        @input.scan_until(/\*\/|\z/)
      elsif @input.check(/\/\//)
        # skip line comment
        @input.scan_until(/\n|\z/)
      else
        return {
          :type => :operator,
          :value => @input.scan(/\//)
        }
      end
    elsif @input.check(NUM)
      nr = @input.scan(NUM)
      if @input.check(/%/)
        return {
          :type => :percentage,
          :value => nr + @input.scan(/%/)
        }
      elsif @input.check(IDENT)
        return {
          :type => :dimension,
          :value => nr + @input.scan(IDENT)
        }
      else
        return {
          :type => :number,
          :value => nr
        }
      end
    elsif @input.check(/@/)
      return maybe(:at_keyword, /@/, IDENT)
    elsif @input.check(/#/)
      return maybe(:hash, /#/, NAME)
    elsif @input.check(/\$/)
      return maybe(:var, /\$/, IDENT)
    elsif @input.check(/./)
      return {
        :type => :delim,
        :value => @input.scan(/./)
      }
    end
  end
end
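The doc-comment branch does two things that are easy to miss: it counts newlines before the comment to get a 1-based line number, and it strips the /** and */ delimiters from the value. The sketch below isolates that branch; `read_doc_comment` is a hypothetical helper, and the scanner is moved to the comment by hand only to keep the example short.

```ruby
require 'strscan'

# Hypothetical helper: builds a :doc_comment token, assuming the
# scanner is positioned at the opening "/**".
def read_doc_comment(input)
  {
    :type => :doc_comment,
    # Newlines before the comment, plus one, give the 1-based line number.
    # This is evaluated before the scan below moves the scanpointer.
    :linenr => input.string[0...input.pos].count("\n") + 1,
    # Read up to the closing "*/" (or end of input), then drop the delimiters.
    :value => input.scan_until(/\*\/|\z/).sub(/\A\/\*\*/, "").sub(/\*\/\z/, ""),
  }
end

source = "a { color: red }\n\n/** Docs for b */\nb { }\n"
input = StringScanner.new(source)
input.pos = source.index("/**")  # jump to the doc-comment for this demo

tok = read_doc_comment(input)
tok[:linenr]  # 3
tok[:value]   # " Docs for b "
```

An unterminated doc-comment does not raise: the `\z` alternative makes the scan stop at the end of input, so the value is simply everything after the opening /**.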

#skip_white ⇒ Object

Skips any whitespace at the current scan position.



# File 'lib/jsduck/css/lexer.rb', line 190

def skip_white
  @input.scan(/\s+/)
end