Class: XMLLexer

Inherits:
Object
Includes:
LogUtils
Defined in:
lib/sitac_lexer.rb

Overview

Note:

This lexer is auto-generated from a semantics file

XML lexer built from a given syntax set

Author:

  • PRV

Constant Summary

Constants included from LogUtils

LogUtils::Log

Instance Method Summary

Constructor Details

#initialize(file, syntax = 'ntk') ⇒ XMLLexer

Initialize the lexer

Parameters:

  • file (String)

    the file to read the SITAC from

  • syntax (String) (defaults to: 'ntk')

    the syntax to use (ntk/melissa)



# File 'lib/sitac_lexer.rb', line 16

def initialize(file, syntax = 'ntk')
  @rules = []
  @semset = syntax == 'ntk' ? NTKSemantics.new.regexes : nil
  Log.info('Creating lexer', 'CoMe_Lexer')
  @code = File.read(file)
  @code = @code.to_s.gsub(/\s{2,}/, "\n")
  gen_lexer
  @tokens = []
  Log.info('Lexer created', 'CoMe_Lexer')
end
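Before lexing, the constructor collapses every run of two or more whitespace characters into a single newline. A minimal, self-contained sketch of that normalization step (the sample XML string is hypothetical):

```ruby
# Mirrors the constructor's normalization: runs of two or more
# whitespace characters become a single newline.
# The sample input below is hypothetical.
code = "<sitac>   <point x='1'/>\n\n  </sitac>"
normalized = code.gsub(/\s{2,}/, "\n")
```

Note that single spaces (e.g. between a tag name and an attribute) are preserved; only multi-character whitespace runs are collapsed.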

Instance Method Details

#gen_lexer ⇒ Object

Generate the lexer from the syntax file



# File 'lib/sitac_lexer.rb', line 36

def gen_lexer
  # syntax file to tokens
  @semset.each do |key, value|
    token ":#{key}" => value
  end
rescue StandardError
  Log.err('Error while generating lexer', 'CoMe_Lexer')
end
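gen_lexer walks the semantics hash and registers one rule per entry, naming each rule ":#{key}". A self-contained sketch of that hash-to-rules mapping (the sample regexes stand in for the real NTKSemantics set and are hypothetical):

```ruby
# Hypothetical stand-in for NTKSemantics.new.regexes.
semset = {
  point: /<point[^>]*>/,
  line:  /<line[^>]*>/
}

rules = []
semset.each do |key, value|
  # Each semantics entry becomes a [name, pattern] rule,
  # as the token helper does in the lexer.
  rules << [":#{key}", value]
end
```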

#get_tokens ⇒ Object

Get the tokens from the code



# File 'lib/sitac_lexer.rb', line 46

def get_tokens
  # get matches for rules
  @rules.each do |rule, regex|
    # for each match, create a token
    @tokens << Token.new(rule, @code.scan(regex), 0)

    Log.info("Read #{@tokens.length} lexems from #{@code.length} bytes of code.", 'CoMe_Lexer', "\r")
  end
  @tokens
end
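Each rule's regex is applied to the whole source with String#scan, so a single token carries every match for that rule. A standalone sketch of the scanning step (the sample code string and rules are hypothetical):

```ruby
# Hypothetical source string and rules.
code  = "<point x='1'/> <point x='2'/> <line/>"
rules = { point: /<point[^>]*\/>/, line: /<line\/>/ }

tokens = rules.map do |name, regex|
  # scan returns every non-overlapping match for the rule's regex
  [name, code.scan(regex)]
end
```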

#token(tk) ⇒ Object

Add a token rule to the rules list

Parameters:

  • tk (Hash)

    the rule to add, as a one-entry hash mapping a rule name to its pattern



# File 'lib/sitac_lexer.rb', line 29

def token tk
  token = tk.keys.first
  pattern = tk.values.first
  @rules << [token, pattern]
end
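The helper receives a one-entry hash (name => pattern) and splits it into a [name, pattern] pair. A sketch of the same extraction, outside the class (the rule name and regex are hypothetical):

```ruby
rules = []

# Mirrors the lexer's token helper: the argument is a
# single-entry hash mapping a rule name to its regex.
add_rule = lambda do |tk|
  name    = tk.keys.first
  pattern = tk.values.first
  rules << [name, pattern]
end

add_rule.call(':figure' => /<figure[^>]*>/)
```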

#tokenize ⇒ Object

Tokenize the code



# File 'lib/sitac_lexer.rb', line 58

def tokenize
  Log.info('Tokenizing', 'CoMe_Lexer')
  get_tokens
  Log.info("Tokenized #{@tokens.length} tokens", 'CoMe_Lexer')
  @tokens
end
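Typical use is simply new followed by tokenize. Since the real class reads a file and depends on NTKSemantics, here is a self-contained miniature of the same pipeline (normalize, build rules, scan); all names and patterns are hypothetical:

```ruby
# Miniature of the lexer pipeline: normalize whitespace,
# derive rules from a semantics hash, then scan the source.
code = "<sitac>  <point/>  <point/></sitac>"
code = code.gsub(/\s{2,}/, "\n")

semset = { point: /<point\/>/ }
rules  = semset.map { |k, v| [":#{k}", v] }

tokens = rules.map { |name, regex| [name, code.scan(regex)] }
```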