pax_global_header00006660000000000000000000000064115000146510014503gustar00rootroot0000000000000052 comment=147de6c1f992611144ae3592becfbd7049d020fe
rexical-1.0.5/000077500000000000000000000000001150001465100131355ustar00rootroot00000000000000
rexical-1.0.5/.gitignore000066400000000000000000000000201150001465100151150ustar00rootroot00000000000000
tags
.*.swp
pkg
rexical-1.0.5/CHANGELOG.rdoc000066400000000000000000000005031150001465100152730ustar00rootroot00000000000000
=== 1.0.5 / Not released

* Bug fixes

  * Scanners with nested classes work better

=== 1.0.4

* Bug fixes

  * Generated tokenizer only tokenizes on pulls

=== 1.0.3

* Bug fixes

  * renamed to "Rexical" because someone already has "rex".

=== 1.0.2

* Bug fixes

  * Fixed nested macros so that backslashes will work

rexical-1.0.5/DOCUMENTATION.en.rdoc000066400000000000000000000123321150001465100164210ustar00rootroot00000000000000
= REX: Ruby Lex for Racc

== About

Lexical scanner generator for Ruby, for use with Racc.

== Usage

  rex [options] grammarfile

  -o  --output-file  filename  specify the output filename.
  -s  --stub                   append a stub main for debugging.
  -i  --ignorecase             ignore character case.
  -C  --check-only             check syntax only.
      --independent            independent mode.
  -d  --debug                  print debug information.
  -h  --help                   print usage.
      --version                print version.
      --copyright              print copyright.

== Default Output Filename

For foo.rex, the default output file is foo.rex.rb.
The name is chosen so that the scanner can be loaded as follows.

  require 'foo.rex'

== Grammar File Format

A grammar file consists of a header part, a rule part, and a footer part,
in that order.
The rule part contains one or more sections.
Each section begins with a keyword at the start of a line.

Summary:

  [Header Part]
  "class" Foo
  ["option"
    [options] ]
  ["inner"
    [methods] ]
  ["macro"
    [macro-name  regular-expression] ]
  "rule"
    [start-state]  pattern  [actions]
  "end"
  [Footer Part]

=== Grammar File Example

  class Foo
  macro
    BLANK         \s+
    DIGIT         \d+
  rule
    {BLANK}
    {DIGIT}       { [:NUMBER, text.to_i] }
    .             { [text, text] }
  end

== Header Part ( Optional )

Everything written before the rule part is copied to the head of the
output file.

== Footer Part ( Optional )

Everything written after the rule part is copied to the tail of the
output file.

== Rule Part

The rule part runs from the line that begins with the "class" keyword to
the line that begins with the "end" keyword.
The name of the generated class is given after the "class" keyword.
If the name is qualified with a module name, it becomes a class in that module.
The generated class inherits from Racc::Parser.

=== Rule Header Example

  class Foo
  class Bar::Foo

== Option Section ( Optional )

This section begins with the "option" keyword.

  "ignorecase"   ignore character case when matching patterns.
  "stub"         append a stub main for debugging.
  "independent"  independent mode; the generated class does not inherit from Racc.

== Inner Section ( Optional )

This section begins with the "inner" keyword.
The contents of this section are placed inside the generated scanner class.

== Macro Section ( Optional )

This section begins with the "macro" keyword.
Each entry gives a name to one regular expression.
A space character (0x20) can be included by escaping it with \ .

=== Macro Section Example

  DIGIT         \d+
  IDENT         [a-zA-Z_][a-zA-Z0-9_]*
  BLANK         [\ \t]+
  REMIN         \/\*
  REMOUT        \*\/

== Rule Section

This section begins with the "rule" keyword.

  [state]  pattern  [actions]

=== state: Start State ( Optional )

A start state is written as an identifier prefixed with ":".
If the first letter after the colon is uppercase, it is an exclusive start state.
If it is lowercase, it is an inclusive start state.
The initial and default value of the start state is nil.

=== pattern: String Pattern

A regular expression that specifies the string to match.
A macro name enclosed in { } can be used within the regular expression.
To use a regular expression that contains a space, define it as a macro.

=== actions: Processing Actions ( Optional )

The action is executed when the pattern matches.
An action defines the processing that creates a suitable token.
A token is a two-element array of a type and a value, or nil.
The following elements can be used to create a token.

  lineno   Line number      ( Read Only )
  text     Matched string   ( Read Only )
  state    Start state      ( Read/Write )

An action is a Ruby block enclosed in { }.
Do not use constructs that transfer control out of the block.
( return, exit, next, break, ... )
If the action is omitted, the matched string is discarded and scanning
proceeds to the next match.

=== Rule Part Example

  {REMIN}                 { state = :REM ; [:REM_IN, text] }
  :REM  {REMOUT}          { state = nil ; [:REM_OUT, text] }
  :REM  (.+)(?={REMOUT})  { [:COMMENT, text] }
  {BLANK}
  -?{DIGIT}               { [:NUMBER, text.to_i] }
  {WORD}                  { [:word, text] }
  .                       { [text, text] }

== Comment ( Optional )

On each line, everything from "#" to the end of the line is a comment.

== Usage for Generated Class

=== scan_setup()

A hook for initializing the scanner when scanning starts.
Redefine it as needed.

=== scan_str( str )

Parses a string written in the defined grammar.
Tokens are held internally.

=== scan_file( filename )

Parses a file written in the defined grammar.
Tokens are held internally.

=== next_token

Takes out one of the internally held tokens.
Returns nil when no tokens remain.

== Notice

This specification is provisional and may change without prior notice.
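=== Sketch: How a Generated Scanner Behaves

To make the sections on rules and on next_token above more concrete, the
following is a hand-written approximation of a scanner for the "Foo" grammar
example. It is only a sketch under stated assumptions: the class name
FooSketch is hypothetical, the code is not actual rex output, and it uses
nothing but Ruby's standard strscan library, so it omits the Racc::Parser
superclass and the start-state handling.

```ruby
require 'strscan'

# Hand-written approximation of a scanner for the "Foo" grammar example.
# Illustration only: FooSketch is a hypothetical name, NOT actual rex output.
class FooSketch
  attr_reader :lineno

  # Prepare to scan +str+.
  def scan_setup(str)
    @ss = StringScanner.new(str)
    @lineno = 1
  end

  # Return the next token as [type, value], or nil once the input is exhausted.
  # A rule without an action ({BLANK}) produces no token and is skipped.
  def next_token
    until @ss.eos?
      # Count a line when the next character is a newline.
      @lineno += 1 if @ss.peek(1) == "\n"
      if @ss.scan(/\s+/)                 # {BLANK}   (no action)
        next
      elsif (text = @ss.scan(/\d+/))     # {DIGIT}   { [:NUMBER, text.to_i] }
        return [:NUMBER, text.to_i]
      elsif (text = @ss.scan(/./))       # .         { [text, text] }
        return [text, text]
      else
        raise "can not match: '#{@ss.rest}'"
      end
    end
    nil
  end
end

scanner = FooSketch.new
scanner.scan_setup("12 + 34")
tokens = []
while (token = scanner.next_token)
  tokens << token
end
p tokens  # => [[:NUMBER, 12], ["+", "+"], [:NUMBER, 34]]
```

Rules are tried in the order they are written, so the catch-all "." rule
comes last; placing it earlier would shadow the {DIGIT} rule.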
rexical-1.0.5/DOCUMENTATION.ja.rdoc000066400000000000000000000107241150001465100164140ustar00rootroot00000000000000 = REX: Ruby Lex for Racc == 概要 Racc と併用する Ruby 用の字句スキャナ生成ツール。 == 使い方 rex [options] grammarfile -o --output-file filename 出力ファイル名指定 -s --stub デバッグ用の主処理を付加 -i --ignorecase 大文字小文字を区別しない -C --check-only 文法検査のみ --independent 非依存モード -d --debug デバッグ情報表示 -h --help 使い方の説明 --version バージョン表明 --copyright 著作権情報表示 == デフォルトの出力ファイル名 foo.rex について foo.rex.rb を出力する。 以下のように利用されることを想定している。 require 'foo.rex' == 入力ファイル構造 頭部、規則部、脚部の順に定義する。 規則部には、複数の節が含まれる。 各節は、行頭がキーワードで始まる。 概要: [頭部] "class" Foo ["option" [オプション] ] ["inner" [メソッド定義] ] ["macro" [マクロ名 正規表現] ] "rule" [スタート状態] パターン [アクション] "end" [脚部] === 入力ファイル記述例 class Foo macro BLANK \s+ DIGIT \d+ rule {BLANK} {DIGIT} { [:NUMBER, text.to_i] } . { [text, text] } end == 頭部(省略可能) 規則部の定義以前に記述された内容は、すべて出力ファイル冒頭に転記される。 == 脚部(省略可能) 規則部の定義以降に記述された内容は、すべて出力ファイル末尾に転記される。 == 規則部 規則部は "class" キーワードから始まる行から "end" キーワードから始まる 行までである。 "class" キーワードに続けて出力するクラス名を指定する。 モジュール名で修飾すると、モジュール内クラスとなる。 Racc::Parser を継承したクラスを生成する。 === 規則部定義例 class Foo class Bar::Foo == オプション(省略可能) この節は "option" キーワードで始まる。 "ignorecase" 大文字小文字を区別しない。 "stub" デバッグ用の主処理を付加 "independent" 非依存モード。Racc を継承しない。 == 内部ユーザコード(省略可能) この節は "inner" キーワードで始まる。 ここで定義した内容は、生成したスキャナのクラスの内部で定義される。 == マクロ定義(省略可能) この節は "macro" キーワードで始まる。 一綴りの正規表現に名前をつける。 \ でエスケープすることで空白を含めることができる。 === マクロ定義例 DIGIT \d+ IDENT [a-zA-Z_][a-zA-Z0-9_]* BLANK [\ \t]+ REMIN \/\* REMOUT \*\/ == 走査規則 この節は "rule" キーワードで始まる。 [state] pattern [actions] === state: スタート状態(省略可能) スタート状態は ":" を前置する識別子で表される。 続く英字が大文字のとき、排他的スタート状態となる。 小文字のとき、包含的スタート状態となる。 スタート状態の初期値および省略時値は nil である。 === pattern: 文字列パターン 文字列を特定するための正規表現。 正規表現の記述には、括弧で括ったマクロ定義を用いることができる。 空白を含む正規表現を用いるには、マクロを使用する。 === actions: アクション(省略可能) パターンに適合するときアクションは実行される。 適切なトークンを作成する処理を定義する。 トークンは、種別と値の二項を持つ配列、または nil である。 トークンを作成するために以下の要素を利用できる。 lineno 入力行番号 ( Read Only ) text 検出した文字列 ( Read Only ) state スタート状態 ( Read/Write ) アクションは { } で括った Ruby のブロックである。 
ブロックを越えて制御の流れを変える機能を使用してはいけない。 ( return, exit, next, break, ... ) アクションが省略されると、適合した文字列は破棄されて次の走査に進む。 === 走査規則定義例 {REMIN} { state = :REM ; [:REM_IN, text] } :REM {REMOUT} { state = nil ; [:REM_OUT, text] } :REM (.+)(?={REMOUT}) { [:COMMENT, text] } {BLANK} -?{DIGIT} { [:NUMBER, text.to_i] } {WORD} { [:word, text] } . { [text, text] } == コメント(省略可能) 各行において "#" から 行末までがコメントになる。 == 生成したクラスの使い方 === scan_setup() スキャナの実行開始時に初期化するためのイベント。 再定義して使用する。 === scan_str( str ) 定義された文法によって記述された文字列を解釈する。 token を内部に保持する。 === scan_file( filename ) 定義された文法によって記述されたファイルを読み込む。 token を内部に保持する。 === next_token 内部に保持する token をひとつずつ取り出す。 最後は nil を返す。 == 注意 本仕様は暫定的であり、予告なく変更される場合がある。 rexical-1.0.5/Manifest.txt000066400000000000000000000012071150001465100154440ustar00rootroot00000000000000CHANGELOG.rdoc DOCUMENTATION.en.rdoc DOCUMENTATION.ja.rdoc Manifest.txt README.ja README.rdoc Rakefile bin/rex lib/rexical.rb lib/rexical/generator.rb lib/rexical/rexcmd.rb sample/a.cmd sample/b.cmd sample/c.cmd sample/calc3.racc sample/calc3.rex sample/calc3.rex.rb sample/calc3.tab.rb sample/error1.rex sample/error2.rex sample/sample.html sample/sample.rex sample/sample.rex.rb sample/sample.xhtml sample/sample1.c sample/sample1.rex sample/sample2.bas sample/sample2.rex sample/simple.html sample/simple.xhtml sample/xhtmlparser.racc sample/xhtmlparser.rex test/assets/test.rex test/rex-20060125.rb test/rex-20060511.rb test/test_generator.rb rexical-1.0.5/README.ja000066400000000000000000000033221150001465100144060ustar00rootroot00000000000000Rexical README =========== Rexical は Ruby のためのスキャナジェネレータです。 lex の Ruby 版に相当します。 Racc とともに使うように設計されています。 必要環境 -------- * ruby 1.8 以降 インストール ------------ パッケージのトップディレクトリで次のように入力してください。 ($ は通常ユーザ、# はルートのプロンプトです) $ ruby setup.rb config $ ruby setup.rb setup ($ su) # ruby setup.rb install これで通常のパスに Racc がインストールされます。自分の好き なディレクトリにインストールしたいときは、setup.rb config に 各種オプションをつけて実行してください。オプションのリストは $ ruby setup.rb --help で見られます。 テスト ------ sample/ 以下にいくつか Rexical の文法ファイルのサンプルが用意 
してあります。以下を実行してください。 $ rex sample1.rex --stub $ ruby sample1.rex.rb sample1.c $ rex sample2.rex --stub $ ruby sample2.rex.rb sample2.bas $ racc calc3.racc $ rex calc3.rex $ ruby calc3.tab.rb Rexical の詳しい文法は doc/ ディレクトリ以下を見てください。 また記述例は sample/ ディレクトリ以下を見てください。 ライセンス ---------- ライセンスは GNU Lesser General Public License (LGPL) version 2 です。ただしユーザが書いた規則ファイルや、Racc がそこから生成した Ruby スクリプトはその対象外です。好きなライセンスで配布してください。 バグなど -------- Rexical を使っていてバグらしき現象に遭遇したら、下記のアドレスまで メールをください。 そのときはできるだけバグを再現できる文法ファイルを付けてください。 ARIMA Yasuhiro arima.yasuhiro@nifty.com http://raa.ruby-lang.org/project/rex/ rexical-1.0.5/README.rdoc000066400000000000000000000020161150001465100147420ustar00rootroot00000000000000= Rexical * http://github.com/tenderlove/rexical/tree/master == DESCRIPTION Rexical is a lexical scanner generator. It is written in Ruby itself, and generates Ruby program. It is designed for use with Racc. == SYNOPSIS Here is a sample lexical definition: class Sample macro BLANK [\ \t]+ rule BLANK # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n . { [text, text] } end Here is the command line usage: $ rex sample1.rex --stub $ ruby sample1.rex.rb sample1.c $ rex sample2.rex --stub $ ruby sample2.rex.rb sample2.bas $ racc calc3.racc $ rex calc3.rex $ ruby calc3.tab.rb == REQUIREMENTS * ruby version 1.8.x or later. == INSTALL * sudo gem install rexical == LICENSE Rexical is distributed under the terms of the GNU Lesser General Public License version 2. Note that you do NOT need to follow LGPL for your own parser (Rexical outputs). You can provide those files under any licenses you want. 
rexical-1.0.5/Rakefile000066400000000000000000000011621150001465100146020ustar00rootroot00000000000000# -*- ruby -*- require 'rubygems' require 'hoe' Hoe.plugin :debugging Hoe.plugin :git Hoe.plugins.delete :rubyforge Hoe.spec 'rexical' do self.readme_file = 'README.rdoc' self.history_file = 'CHANGELOG.rdoc' developer('Aaron Patterson', 'aaronp@rubyforge.org') self.rubyforge_name = 'ruby-rex' self.extra_rdoc_files = FileList['*.rdoc'] end namespace :gem do namespace :spec do task :dev do File.open("#{HOE.name}.gemspec", 'w') do |f| HOE.spec.version = "#{HOE.version}.#{Time.now.strftime("%Y%m%d%H%M%S")}" f.write(HOE.spec.to_ruby) end end end end # vim: syntax=Ruby rexical-1.0.5/bin/000077500000000000000000000000001150001465100137055ustar00rootroot00000000000000rexical-1.0.5/bin/rex000066400000000000000000000006751150001465100144360ustar00rootroot00000000000000#!/usr/bin/env ruby # # rex # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. # You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. # For details of LGPL, see the file "COPYING". # ## --------------------------------------------------------------------- require 'rubygems' require 'rexical' Rexical::Cmd.new.run rexical-1.0.5/lib/000077500000000000000000000000001150001465100137035ustar00rootroot00000000000000rexical-1.0.5/lib/rexical.rb000066400000000000000000000002751150001465100156630ustar00rootroot00000000000000require 'rexical/generator' require 'rexical/rexcmd' module Rexical VERSION = "1.0.5" Copyright = 'Copyright (c) 2005-2006 ARIMA Yasuhiro' Mailto = 'arima.yasuhiro@nifty.com' end rexical-1.0.5/lib/rexical/000077500000000000000000000000001150001465100153325ustar00rootroot00000000000000rexical-1.0.5/lib/rexical/generator.rb000066400000000000000000000270171150001465100176540ustar00rootroot00000000000000# # generator.rb # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. 
# You can distribute/modify this program under the terms of # the GNU Lesser General Public License version 2 or later. # require 'strscan' module Rexical ## --------------------------------------------------------------------- class ParseError < StandardError ; end ## --------------------------------------------------------------------- class Generator ## --------------------------------------------------------------------- attr_accessor :grammar_file attr_accessor :grammar_lines attr_accessor :scanner_file attr_accessor :module_name attr_accessor :class_name attr_accessor :lineno attr_accessor :rules attr_accessor :exclusive_states attr_accessor :ignorecase attr_accessor :independent attr_accessor :debug ## --------------------------------------------------------------------- def initialize(opts) @lineno = 0 @macro = {} @rules = [] @exclusive_states = [nil] @grammar_lines = nil @scanner_header = "" @scanner_footer = "" @scanner_inner = "" @opt = opts end ## --------------------------------------------------------------------- def add_header( st ) @scanner_header += "#{st}\n" end ## --------------------------------------------------------------------- def add_footer( st ) @scanner_footer += "#{st}\n" end ## --------------------------------------------------------------------- def add_inner( st ) @scanner_inner += "#{st}\n" end ## --------------------------------------------------------------------- def add_option( st ) opts = st.split opts.each do |opt| case opt when /ignorecase/i @opt['--ignorecase'] = true when /stub/i @opt['--stub'] = true when /independent/i @opt['--independent'] = true end end end ## --------------------------------------------------------------------- def add_macro( st ) ss = StringScanner.new(st) ss.scan(/\s+/) key = ss.scan(/\S+/) ss.scan(/\s+/) st = ss.post_match len = st.size ndx = 0 while ndx <= len c = st[ndx,1] ndx += 1 case c when '\\' ndx += 1 next when '#', ' ' ndx -= 1 break end end expr = st[0,ndx] expr.gsub!('\ ', ' ') key = '{' 
+ key + '}' @macro.each_pair do |k, e| expr.gsub!(k) { |m| e } end @macro[key] = expr rescue raise ParseError, "parse error in add_macro:'#{st}'" end ## --------------------------------------------------------------------- def add_rule( rule_state, rule_expr, rule_action=nil ) st = rule_expr.dup @macro.each_pair do |k, e| rule_expr.gsub!(k) { |m| e } end if rule_state.to_s[1,1] =~ /[A-Z]/ @exclusive_states << rule_state unless @exclusive_states.include?(rule_state) exclusive_state = rule_state start_state = nil else exclusive_state = nil start_state = rule_state end rule = [exclusive_state, start_state, rule_expr, rule_action] @rules << rule rescue raise ParseError, "parse error in add_rule:'#{st}'" end def read_grammar @grammar_lines = StringScanner.new File.read(grammar_file) end def next_line @lineno += 1 @grammar_lines.scan_until(/\n/).chomp rescue nil end def parse state1 = :HEAD state2 = nil state3 = nil lastmodes = [] while st = next_line case state1 when :FOOT add_footer st when :HEAD ss = StringScanner.new(st) if ss.scan(/class/) state1 = :CLASS st = ss.post_match.strip @class_name = st else add_header st end when :CLASS s = st.strip next if s.size == 0 or s[0,1] == '#' ss = StringScanner.new(st) if ss.scan(/option.*$/) state2 = :OPTION next end if ss.scan(/inner.*$/) state2 = :INNER next end if ss.scan(/macro.*$/) state2 = :MACRO next end if ss.scan(/rule.*$/) state2 = :RULE next end if ss.scan(/end.*$/) state1 = :FOOT next end case state2 when :OPTION add_option st when :INNER add_inner st when :MACRO add_macro st when :RULE case state3 when nil rule_state, rule_expr, rule_action = parse_rule(st) if rule_action =~ /\s*\{/ lastmodes = parse_action(rule_action, lastmodes) if lastmodes.empty? add_rule rule_state, rule_expr, rule_action else state3 = :CONT rule_action += "\n" end else add_rule rule_state, rule_expr end when :CONT rule_action += "#{st}\n" lastmodes = parse_action(st, lastmodes) if lastmodes.empty? 
state3 = nil add_rule rule_state, rule_expr, rule_action else end end # case state3 end # case state2 end # case state1 end # while end ## --------------------------------------------------------------------- def parse_rule(st) st.strip! return if st.size == 0 or st[0,1] == '#' ss = StringScanner.new(st) ss.scan(/\s+/) rule_state = ss.scan(/\:\S+/) ss.scan(/\s+/) rule_expr = ss.scan(/\S+/) ss.scan(/\s+/) [rule_state, rule_expr, ss.post_match] end ## --------------------------------------------------------------------- def parse_action(st, lastmodes=[]) modes = lastmodes mode = lastmodes[-1] ss = StringScanner.new(st) until ss.eos? c = ss.scan(/./) case c when '#' if (mode == :brace) or (mode == nil) #p [c, mode, modes] return modes end when '{' if (mode == :brace) or (mode == nil) mode = :brace modes.push mode end when '}' if (mode == :brace) modes.pop mode = modes[0] end when "'" if (mode == :brace) mode = :quote modes.push mode elsif (mode == :quote) modes.pop mode = modes[0] end when '"' if (mode == :brace) mode = :doublequote modes.push mode elsif (mode == :doublequote) modes.pop mode = modes[0] end when '`' if (mode == :brace) mode = :backquote modes.push mode elsif (mode == :backquote) modes.pop mode = modes[0] end end end #p [c, mode, modes] return modes end ## --------------------------------------------------------------------- REX_HEADER = <<-REX_EOT #-- # DO NOT MODIFY!!!! # This file is automatically generated by rex %s # from lexical definition file "%s". 
#++ REX_EOT REX_UTIL = <<-REX_EOT require 'strscan' class ScanError < StandardError ; end attr_reader :lineno attr_reader :filename attr_accessor :state def scan_setup(str) @ss = StringScanner.new(str) @lineno = 1 @state = nil end def action yield end def scan_str(str) scan_setup(str) do_parse end alias :scan :scan_str def load_file( filename ) @filename = filename open(filename, "r") do |f| scan_setup(f.read) end end def scan_file( filename ) load_file(filename) do_parse end REX_EOT REX_STUB = <<-REX_EOT if __FILE__ == $0 exit if ARGV.size != 1 filename = ARGV.shift rex = %s.new begin rex.load_file filename while token = rex.next_token p token end rescue $stderr.printf %s, rex.filename, rex.lineno, $!.message end end REX_EOT ## --------------------------------------------------------------------- def scanner_io unless scanner_file = @opt['--output-file'] scanner_file = grammar_file + ".rb" end f = File.open(scanner_file, 'w') end private :scanner_io def write_scanner f = scanner_io ## scan flag flag = "" flag += "i" if @opt['--ignorecase'] ## header f.printf REX_HEADER, Rexical::VERSION, grammar_file unless @opt['--independent'] f.printf "require 'racc/parser'\n" end @scanner_header.each_line do |s| f.print s end if @opt['--independent'] f.puts "class #{@class_name}" else f.puts "class #{@class_name} < Racc::Parser" end ## utility method f.print REX_UTIL ## scanner method f.print <<-REX_EOT def next_token return if @ss.eos? # skips empty actions until token = _next_token or @ss.eos?; end token end def _next_token text = @ss.peek(1) @lineno += 1 if text == "\\n" token = case @state REX_EOT exclusive_states.each do |es| f.printf <<-REX_EOT when #{es ? 
es.to_s : "nil"} case REX_EOT rules.each do |rule| exclusive_state, start_state, rule_expr, rule_action = *rule if es == exclusive_state if rule_action if start_state f.print <<-REX_EOT when((state == #{start_state}) and (text = @ss.scan(/#{rule_expr}/#{flag}))) action #{rule_action} REX_EOT else f.print <<-REX_EOT when (text = @ss.scan(/#{rule_expr}/#{flag})) action #{rule_action} REX_EOT end else if start_state f.print <<-REX_EOT when (state == #{start_state}) and (text = @ss.scan(/#{rule_expr}/#{flag})) ; REX_EOT else f.print <<-REX_EOT when (text = @ss.scan(/#{rule_expr}/#{flag})) ; REX_EOT end end end end f.print <<-REX_EOT else text = @ss.string[@ss.pos .. -1] raise ScanError, "can not match: '" + text + "'" end # if REX_EOT end f.print <<-REX_EOT else raise ScanError, "undefined state: '" + state.to_s + "'" end # case state REX_EOT if @opt['--debug'] f.print <<-REX_EOT p token REX_EOT end f.print <<-REX_EOT token end # def _next_token REX_EOT ## inner method @scanner_inner.each_line do |s| f.print s end f.puts "end # class" ## footer @scanner_footer.each_line do |s| f.print s end # case ## stub main f.printf REX_STUB, @class_name, '"%s:%d:%s\n"' if @opt['--stub'] f.close end ## def end ## class end ## module ## --------------------------------------------------------------------- ## test if __FILE__ == $0 rex = Rexical::Generator.new(nil) rex.grammar_file = "sample.rex" rex.read_grammar rex.parse rex.write_scanner end rexical-1.0.5/lib/rexical/rexcmd.rb000066400000000000000000000065601150001465100171500ustar00rootroot00000000000000# # rexcmd.rb # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. # You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. # For details of LGPL, see the file "COPYING". 
# ## --------------------------------------------------------------------- require 'getoptlong' module Rexical class Cmd OPTIONS = <<-EOT o -o --output-file file name of output [.rb] o -s --stub - append stub code for debug o -i --ignorecase - ignore char case o -C --check-only - syntax check only o - --independent - independent mode o -d --debug - print debug information o -h --help - print this message and quit o - --version - print version and quit o - --copyright - print copyright and quit EOT def run @status = 1 usage 'no grammar file given' if ARGV.empty? usage 'too many grammar files given' if ARGV.size > 1 filename = ARGV[0] rex = Rexical::Generator.new(@opt) begin rex.grammar_file = filename rex.read_grammar rex.parse if @opt['--check-only'] $stderr.puts "syntax ok" return 0 end rex.write_scanner @status = 0 rescue Rexical::ParseError, Errno::ENOENT msg = $!.to_s unless /\A\d/ === msg msg[0,0] = ' ' end $stderr.puts "#{@cmd}:#{rex.grammar_file}:#{rex.lineno}:#{msg}" ensure exit @status end end def initialize @status = 2 @cmd = File.basename($0, ".rb") tmp = OPTIONS.lines.collect do |line| next if /\A\s*\z/ === line disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) a = [] a.push lopt unless lopt == '-' a.push sopt unless sopt == '-' a.push takearg == '-' ? GetoptLong::NO_ARGUMENT : GetoptLong::REQUIRED_ARGUMENT a end getopt = GetoptLong.new(*tmp.compact) getopt.quiet = true @opt = {} begin getopt.each do |name, arg| raise GetoptLong::InvalidOption, "#{@cmd}: #{name} given twice" if @opt.key? name @opt[name] = arg.empty? ? 
true : arg end rescue GetoptLong::AmbigousOption, GetoptLong::InvalidOption, GetoptLong::MissingArgument, GetoptLong::NeedlessArgument usage $!.message end usage if @opt['--help'] if @opt['--version'] puts "#{@cmd} version #{Rexical::VERSION}" exit 0 end if @opt['--copyright'] puts "#{@cmd} version #{Rexical::VERSION}" puts "#{Rexical::Copyright} <#{Rexical::Mailto}>" exit 0 end end def usage( msg=nil ) f = $stderr f.puts "#{@cmd}: #{msg}" if msg f.print <<-EOT Usage: #{@cmd} [options] Options: EOT OPTIONS.each_line do |line| next if line.strip.empty? if /\A\s*\z/ === line f.puts next end disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) if disp == 'o' sopt = nil if sopt == '-' lopt = nil if lopt == '-' opt = [sopt, lopt].compact.join(',') takearg = nil if takearg == '-' opt = [opt, takearg].compact.join(' ') f.printf "%-27s %s\n", opt, doc end end exit @status end end end rexical-1.0.5/sample/000077500000000000000000000000001150001465100144165ustar00rootroot00000000000000rexical-1.0.5/sample/a.cmd000066400000000000000000000000401150001465100153150ustar00rootroot00000000000000call rex xhtmlparser.rex -s %* rexical-1.0.5/sample/b.cmd000066400000000000000000000000421150001465100153200ustar00rootroot00000000000000call racc xhtmlparser.racc -v %* rexical-1.0.5/sample/c.cmd000066400000000000000000000002511150001465100153230ustar00rootroot00000000000000:ruby xhtmlparser.tab.rb simple.html %* :ruby xhtmlparser.tab.rb simple.xhtml %* :ruby xhtmlparser.tab.rb sample.html %* ruby xhtmlparser.tab.rb sample.xhtml %* rexical-1.0.5/sample/calc3.racc000066400000000000000000000014701150001465100162370ustar00rootroot00000000000000# # A simple calculator, version 3. 
# class Calculator3 prechigh nonassoc UMINUS left '*' '/' left '+' '-' preclow options no_result_var rule target : exp | /* none */ { 0 } exp : exp '+' exp { val[0] + val[2] } | exp '-' exp { val[0] - val[2] } | exp '*' exp { val[0] * val[2] } | exp '/' exp { val[0] / val[2] } | '(' exp ')' { val[1] } | '-' NUMBER =UMINUS { -(val[1]) } | NUMBER end ---- header ---- # # generated by racc # require 'calc3.rex' ---- inner ---- ---- footer ---- puts 'sample calc' puts '"q" to quit.' calc = Calculator3.new while true print '>>> '; $stdout.flush str = $stdin.gets.strip break if /q/i === str begin p calc.scan_str(str) rescue ParseError puts 'parse error' end end rexical-1.0.5/sample/calc3.rex000066400000000000000000000003311150001465100161200ustar00rootroot00000000000000# # calc3.rex # lexical scanner definition for rex # class Calculator3 macro BLANK \s+ DIGIT \d+ rule {BLANK} {DIGIT} { [:NUMBER, text.to_i] } .|\n { [text, text] } inner end rexical-1.0.5/sample/calc3.rex.rb000066400000000000000000000035121150001465100165260ustar00rootroot00000000000000# # DO NOT MODIFY!!!! # This file is automatically generated by rex 1.0.0 # from lexical definition file "calc3.rex". # require 'racc/parser' # # calc3.rex # lexical scanner definition for rex # class Calculator3 < Racc::Parser require 'strscan' class ScanError < StandardError ; end attr_reader :lineno attr_reader :filename def scan_setup ; end def action &block yield end def scan_str( str ) scan_evaluate str do_parse end def load_file( filename ) @filename = filename open(filename, "r") do |f| scan_evaluate f.read end end def scan_file( filename ) load_file filename do_parse end def next_token @rex_tokens.shift end def scan_evaluate( str ) scan_setup @rex_tokens = [] @lineno = 1 ss = StringScanner.new(str) state = nil until ss.eos? 
text = ss.peek(1) @lineno += 1 if text == "\n" case state when nil case when (text = ss.scan(/\s+/)) ; when (text = ss.scan(/\d+/)) @rex_tokens.push action { [:NUMBER, text.to_i] } when (text = ss.scan(/.|\n/)) @rex_tokens.push action { [text, text] } else text = ss.string[ss.pos .. -1] raise ScanError, "can not match: '" + text + "'" end # if else raise ScanError, "undefined state: '" + state.to_s + "'" end # case state end # until ss end # def scan_evaluate end # class if __FILE__ == $0 exit if ARGV.size != 1 filename = ARGV.shift rex = Calculator3.new begin rex.load_file filename while token = rex.next_token p token end rescue $stderr.printf "%s:%d:%s\n", rex.filename, rex.lineno, $!.message end end rexical-1.0.5/sample/calc3.tab.rb000066400000000000000000000070631150001465100165030ustar00rootroot00000000000000# # DO NOT MODIFY!!!! # This file is automatically generated by racc 1.4.4 # from racc grammer file "calc3.racc". # require 'racc/parser' # # generated by racc # require 'calc3.rex' class Calculator3 < Racc::Parser ##### racc 1.4.4 generates ### racc_reduce_table = [ 0, 0, :racc_error, 1, 11, :_reduce_none, 0, 11, :_reduce_2, 3, 12, :_reduce_3, 3, 12, :_reduce_4, 3, 12, :_reduce_5, 3, 12, :_reduce_6, 3, 12, :_reduce_7, 2, 12, :_reduce_8, 1, 12, :_reduce_none ] racc_reduce_n = 10 racc_shift_n = 19 racc_action_table = [ 7, 8, 9, 10, 6, 18, 3, 4, 11, 5, 7, 8, 9, 10, 3, 4, 13, 5, 3, 4, nil, 5, 3, 4, nil, 5, 3, 4, nil, 5, 3, 4, nil, 5, 7, 8, 7, 8 ] racc_action_check = [ 12, 12, 12, 12, 1, 12, 10, 10, 3, 10, 2, 2, 2, 2, 0, 0, 6, 0, 4, 4, nil, 4, 9, 9, nil, 9, 8, 8, nil, 8, 7, 7, nil, 7, 17, 17, 16, 16 ] racc_action_pointer = [ 8, 4, 7, -1, 12, nil, 16, 24, 20, 16, 0, nil, -3, nil, nil, nil, 33, 31, nil ] racc_action_default = [ -2, -10, -1, -10, -10, -9, -10, -10, -10, -10, -10, -8, -10, 19, -5, -6, -3, -4, -7 ] racc_goto_table = [ 2, 1, nil, nil, 12, nil, nil, 14, 15, 16, 17 ] racc_goto_check = [ 2, 1, nil, nil, 2, nil, nil, 2, 2, 2, 2 ] racc_goto_pointer = [ 
nil, 1, 0 ] racc_goto_default = [ nil, nil, nil ] racc_token_table = { false => 0, Object.new => 1, :UMINUS => 2, "*" => 3, "/" => 4, "+" => 5, "-" => 6, "(" => 7, ")" => 8, :NUMBER => 9 } racc_use_result_var = false racc_nt_base = 10 Racc_arg = [ racc_action_table, racc_action_check, racc_action_default, racc_action_pointer, racc_goto_table, racc_goto_check, racc_goto_default, racc_goto_pointer, racc_nt_base, racc_reduce_table, racc_token_table, racc_shift_n, racc_reduce_n, racc_use_result_var ] Racc_token_to_s_table = [ '$end', 'error', 'UMINUS', '"*"', '"/"', '"+"', '"-"', '"("', '")"', 'NUMBER', '$start', 'target', 'exp'] Racc_debug_parser = false ##### racc system variables end ##### # reduce 0 omitted # reduce 1 omitted module_eval <<'.,.,', 'calc3.racc', 13 def _reduce_2( val, _values) 0 end .,., module_eval <<'.,.,', 'calc3.racc', 15 def _reduce_3( val, _values) val[0] + val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 16 def _reduce_4( val, _values) val[0] - val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 17 def _reduce_5( val, _values) val[0] * val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 18 def _reduce_6( val, _values) val[0] / val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 19 def _reduce_7( val, _values) val[1] end .,., module_eval <<'.,.,', 'calc3.racc', 20 def _reduce_8( val, _values) -(val[1]) end .,., # reduce 9 omitted def _reduce_none( val, _values) val[0] end end # class Calculator3 puts 'sample calc' puts '"q" to quit.' calc = Calculator3.new while true print '>>> '; $stdout.flush str = $stdin.gets.strip break if /q/i === str begin p calc.scan_str(str) rescue ParseError puts 'parse error' end end rexical-1.0.5/sample/error1.rex000066400000000000000000000003671150001465100163560ustar00rootroot00000000000000# # eooro1.rex # lexical definition sample for rex # class Error1 macro BLANK [\ \t]+ rule BLANK # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n # . 
{ [text, text] } end rexical-1.0.5/sample/error2.rex000066400000000000000000000004101150001465100163440ustar00rootroot00000000000000# # error2.rex # lexical definition sample for rex # class Error2 macro BLANK [\ \t]+ rule BLANK # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n . { state = :NONDEF ; [text, text] } end rexical-1.0.5/sample/sample.html000066400000000000000000000014671150001465100165750ustar00rootroot00000000000000 Title

HTML 4.01

rexical-1.0.5/sample/sample.rex000066400000000000000000000003661150001465100164240ustar00rootroot00000000000000# # sample.rex # lexical definition sample for rex # class Sample macro BLANK [\ \t]+ rule BLANK # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n . { [text, text] } end rexical-1.0.5/sample/sample.rex.rb000066400000000000000000000037211150001465100170240ustar00rootroot00000000000000# # DO NOT MODIFY!!!! # This file is automatically generated by rex 1.0.0 # from lexical definition file "sample.rex". # require 'racc/parser' # # sample.rex # lexical definition sample for rex # class Sample < Racc::Parser require 'strscan' class ScanError < StandardError ; end attr_reader :lineno attr_reader :filename def scan_setup ; end def action &block yield end def scan_str( str ) scan_evaluate str do_parse end def load_file( filename ) @filename = filename open(filename, "r") do |f| scan_evaluate f.read end end def scan_file( filename ) load_file filename do_parse end def next_token @rex_tokens.shift end def scan_evaluate( str ) scan_setup @rex_tokens = [] @lineno = 1 ss = StringScanner.new(str) state = nil until ss.eos? text = ss.peek(1) @lineno += 1 if text == "\n" case state when nil case when (text = ss.scan(/BLANK/)) ; when (text = ss.scan(/\d+/)) @rex_tokens.push action { [:digit, text.to_i] } when (text = ss.scan(/\w+/)) @rex_tokens.push action { [:word, text] } when (text = ss.scan(/\n/)) ; when (text = ss.scan(/./)) @rex_tokens.push action { [text, text] } else text = ss.string[ss.pos .. 
-1] raise ScanError, "can not match: '" + text + "'" end # if else raise ScanError, "undefined state: '" + state.to_s + "'" end # case state end # until ss end # def scan_evaluate end # class if __FILE__ == $0 exit if ARGV.size != 1 filename = ARGV.shift rex = Sample.new begin rex.load_file filename while token = rex.next_token p token end rescue $stderr.printf "%s:%d:%s\n", rex.filename, rex.lineno, $!.message end end rexical-1.0.5/sample/sample.xhtml000066400000000000000000000016001150001465100167520ustar00rootroot00000000000000 Title

XHTML 1.1

rexical-1.0.5/sample/sample1.c000066400000000000000000000001721150001465100161240ustar00rootroot00000000000000 int main(int argc, char **argv) { /* block remark */ int i = 100; // inline remark printf("hello, world\n"); } rexical-1.0.5/sample/sample1.rex000066400000000000000000000020251150001465100164770ustar00rootroot00000000000000# # sample1.rex # lexical definition sample for rex # # usage # rex sample1.rex --stub # ruby sample1.rex.rb sample1.c # class Sample1 macro BLANK \s+ REM_IN \/\* REM_OUT \*\/ REM \/\/ rule # [:state] pattern [actions] # remark {REM_IN} { state = :REMS; [:rem_in, text] } :REMS {REM_OUT} { state = nil; [:rem_out, text] } :REMS .*(?={REM_OUT}) { [:remark, text] } {REM} { state = :REM; [:rem_in, text] } :REM \n { state = nil; [:rem_out, text] } :REM .*(?=$) { [:remark, text] } # literal \"[^"]*\" { [:string, text] } # " \'[^']\' { [:character, text] } # ' # skip {BLANK} # no action # numeric \d+ { [:digit, text.to_i] } # identifier \w+ { [:word, text] } . { [text, text] } end rexical-1.0.5/sample/sample2.bas000066400000000000000000000000671150001465100164530ustar00rootroot00000000000000' inline remark i = 100 input st print "hello, world" rexical-1.0.5/sample/sample2.rex000066400000000000000000000014431150001465100165030ustar00rootroot00000000000000# # sample2.rex # lexical definition sample for rex # # usage # rex sample2.rex --stub # ruby sample2.rex.rb sample2.bas # class Sample2 option ignorecase macro BLANK \s+ REMARK \' # ' rule {REMARK} { state = :REM; [:rem_in, text] } # ' :REM \n { state = nil; [:rem_out, text] } :REM .*(?=$) { [:remark, text] } \"[^"]*\" { [:string, text] } # " {BLANK} # no action INPUT { [:input, text] } PRINT { [:print, text] } \d+ { [:digit, text.to_i] } \w+ { [:word, text] } . { [text, text] } end rexical-1.0.5/sample/simple.html000066400000000000000000000001071150001465100165730ustar00rootroot00000000000000

Hello World.

rexical-1.0.5/sample/simple.xhtml000066400000000000000000000003751150001465100167720ustar00rootroot00000000000000

XHTML 1.1

rexical-1.0.5/sample/xhtmlparser.racc000066400000000000000000000024331150001465100176230ustar00rootroot00000000000000# # xml parser # class XHTMLParser rule target : /* none */ | xml_doc xml_doc : xml_header extra xml_body | xml_header xml_body | xml_body xml_header : xtag_in element attributes xtag_out xml_body : tag_from contents tag_to tag_from : tag_in element attributes tag_out tag_empty : tag_in element attributes etag_out tag_to : etag_in element tag_out attributes : /* none */ | attributes attribute attribute : attr equal quoted quoted : quote1 value quote1 | quote2 value quote2 contents : /* none */ | contents content content : text | extra | tag_from contents tag_to | tag_empty extra : tag_in ext extra_texts tag_out extra_texts : /* none */ | extra_texts rem_in remtexts rem_out | extra_texts exttext remtexts : remtext | remtexts remtext end ---- header ---- # # generated by racc # require 'xhtmlparser.rex' ---- inner ---- ---- footer ---- exit if ARGV.size == 0 filename = ARGV.shift htmlparser = XHTMLParser.new htmlparser.scan_file filename rexical-1.0.5/sample/xhtmlparser.rex000066400000000000000000000052121150001465100175070ustar00rootroot00000000000000# # xhtmlparser.rex # lexical scanner definition for rex # # usage # rex xhtmlparser.rex --stub # ruby xhtmlparser.rex.rb sample.xhtml # class XHTMLParser option ignorecase macro BLANK \s+ TAG_IN \< TAG_OUT \> ETAG_IN \<\/ ETAG_OUT \/\> XTAG_IN \<\? XTAG_OUT \?\> EXT \! 
REM \-\- EQUAL \= Q1 \' Q2 \" rule # [:state] pattern [actions] {XTAG_IN} { state = :TAG; [:xtag_in, text] } {ETAG_IN} { state = :TAG; [:etag_in, text] } {TAG_IN} { state = :TAG; [:tag_in, text] } :TAG {EXT} { state = :EXT; [:ext, text] } :EXT {REM} { state = :REM; [:rem_in, text] } :EXT {XTAG_OUT} { state = nil; [:xtag_out, text] } :EXT {TAG_OUT} { state = nil; [:tag_out, text] } :EXT .+(?={REM}) { [:exttext, text] } :EXT .+(?={TAG_OUT}) { [:exttext, text] } :EXT .+(?=$) { [:exttext, text] } :EXT \n :REM {REM} { state = :EXT; [:rem_out, text] } :REM .+(?={REM}) { [:remtext, text] } :REM .+(?=$) { [:remtext, text] } :REM \n :TAG {BLANK} :TAG {XTAG_OUT} { state = nil; [:xtag_out, text] } :TAG {ETAG_OUT} { state = nil; [:etag_out, text] } :TAG {TAG_OUT} { state = nil; [:tag_out, text] } :TAG {EQUAL} { [:equal, text] } :TAG {Q1} { state = :Q1; [:quote1, text] } # ' :Q1 {Q1} { state = :TAG; [:quote1, text] } # ' :Q1 [^{Q1}]+(?={Q1}) { [:value, text] } # ' :TAG {Q2} { state = :Q2; [:quote2, text] } # " :Q2 {Q2} { state = :TAG; [:quote2, text] } # " :Q2 [^{Q2}]+(?={Q2}) { [:value, text] } # " :TAG [\w\-]+(?={EQUAL}) { [:attr, text] } :TAG [\w\-]+ { [:element, text] } \s+(?=\S) .*\S(?=\s*{ETAG_IN}) { [:text, text] } .*\S(?=\s*{TAG_IN}) { [:text, text] } .*\S(?=\s*$) { [:text, text] } \s+(?=$) inner end rexical-1.0.5/test/000077500000000000000000000000001150001465100141145ustar00rootroot00000000000000rexical-1.0.5/test/assets/000077500000000000000000000000001150001465100154165ustar00rootroot00000000000000rexical-1.0.5/test/assets/test.rex000066400000000000000000000002421150001465100171130ustar00rootroot00000000000000module A module B class C < SomethingElse macro w [\s\r\n\f]* # [:state] pattern [actions] {w}~={w} { [:INCLUDES, text] } end rexical-1.0.5/test/rex-20060125.rb000066400000000000000000000074431150001465100161440ustar00rootroot00000000000000#!C:/Program Files/ruby-1.8/bin/ruby # # rex # # Copyright (c) 2005 ARIMA Yasuhiro # # This program is free software. 
# You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. # For details of LGPL, see the file "COPYING". # ## --------------------------------------------------------------------- REX_OPTIONS = <<-EOT o -o --output-file file name of output [.rb] o -s --stub - append stub code for debug o -i --ignorecase - ignore char case o -C --check-only - syntax check only o - --independent - independent mode o -d --debug - print debug information o -h --help - print this message and quit o - --version - print version and quit o - --copyright - print copyright and quit EOT ## --------------------------------------------------------------------- require 'getoptlong' require 'rex/generator' require 'rex/info' ## --------------------------------------------------------------------- =begin class Rex def initialize end end =end def main $cmd = File.basename($0, ".rb") opt = get_options filename = ARGV[0] rex = Rex::Generator.new(opt) begin rex.grammar_file = filename rex.read_grammar rex.parse if opt['--check-only'] $stderr.puts "syntax ok" return 0 end rex.write_scanner rescue Rex::ParseError, Errno::ENOENT msg = $!.to_s unless /\A\d/ === msg msg[0,0] = ' ' end $stderr.puts "#{$cmd}:#{rex.grammar_file}:#{rex.lineno}:#{msg}" return 1 end return 0 end ## --------------------------------------------------------------------- def get_options tmp = REX_OPTIONS.collect do |line| next if /\A\s*\z/ === line disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) a = [] a.push lopt unless lopt == '-' a.push sopt unless sopt == '-' a.push takearg == '-' ? GetoptLong::NO_ARGUMENT : GetoptLong::REQUIRED_ARGUMENT a end getopt = GetoptLong.new(*tmp.compact) getopt.quiet = true opt = {} begin getopt.each do |name, arg| raise GetoptLong::InvalidOption, "#{$cmd}: #{name} given twice" if opt.key? name opt[name] = arg.empty? ? 
true : arg end rescue GetoptLong::AmbigousOption, GetoptLong::InvalidOption, GetoptLong::MissingArgument, GetoptLong::NeedlessArgument usage 1, $!.message end usage if opt['--help'] if opt['--version'] puts "#{$cmd} version #{Rex::VERSION}" exit 0 end if opt['--copyright'] puts "#{$cmd} version #{Rex::VERSION}" puts "#{Rex::Copyright} <#{Rex::Mailto}>" exit 0 end usage(1, 'no grammar file given') if ARGV.empty? usage(1, 'too many grammar files given') if ARGV.size > 1 opt end ## --------------------------------------------------------------------- def usage(status=0, msg=nil ) f = (status == 0 ? $stdout : $stderr) f.puts "#{$cmd}: #{msg}" if msg f.print <<-EOT Usage: #{$cmd} [options] Options: EOT REX_OPTIONS.each do |line| next if line.strip.empty? if /\A\s*\z/ === line f.puts next end disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) if disp == 'o' sopt = nil if sopt == '-' lopt = nil if lopt == '-' opt = [sopt, lopt].compact.join(',') takearg = nil if takearg == '-' opt = [opt, takearg].compact.join(' ') f.printf "%-27s %s\n", opt, doc end end exit status end ## --------------------------------------------------------------------- main rexical-1.0.5/test/rex-20060511.rb000066400000000000000000000072741150001465100161450ustar00rootroot00000000000000#!/usr/local/bin/ruby # # rex # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. # You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. # For details of LGPL, see the file "COPYING". 
# ## --------------------------------------------------------------------- REX_OPTIONS = <<-EOT o -o --output-file file name of output [.rb] o -s --stub - append stub code for debug o -i --ignorecase - ignore char case o -C --check-only - syntax check only o - --independent - independent mode o -d --debug - print debug information o -h --help - print this message and quit o - --version - print version and quit o - --copyright - print copyright and quit EOT ## --------------------------------------------------------------------- require 'getoptlong' require 'rex/generator' require 'rex/info' ## --------------------------------------------------------------------- class RexRunner def run @status = 1 usage 'no grammar file given' if ARGV.empty? usage 'too many grammar files given' if ARGV.size > 1 filename = ARGV[0] rex = Rex::Generator.new(@opt) begin rex.grammar_file = filename rex.read_grammar rex.parse if @opt['--check-only'] $stderr.puts "syntax ok" return 0 end rex.write_scanner @status = 0 rescue Rex::ParseError, Errno::ENOENT msg = $!.to_s unless /\A\d/ === msg msg[0,0] = ' ' end $stderr.puts "#{@cmd}:#{rex.grammar_file}:#{rex.lineno}:#{msg}" ensure exit @status end end def initialize @status = 2 @cmd = File.basename($0, ".rb") tmp = REX_OPTIONS.collect do |line| next if /\A\s*\z/ === line disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) a = [] a.push lopt unless lopt == '-' a.push sopt unless sopt == '-' a.push takearg == '-' ? GetoptLong::NO_ARGUMENT : GetoptLong::REQUIRED_ARGUMENT a end getopt = GetoptLong.new(*tmp.compact) getopt.quiet = true @opt = {} begin getopt.each do |name, arg| raise GetoptLong::InvalidOption, "#{@cmd}: #{name} given twice" if @opt.key? name @opt[name] = arg.empty? ? 
true : arg end rescue GetoptLong::AmbigousOption, GetoptLong::InvalidOption, GetoptLong::MissingArgument, GetoptLong::NeedlessArgument usage $!.message end usage if @opt['--help'] if @opt['--version'] puts "#{@cmd} version #{Rex::VERSION}" exit 0 end if @opt['--copyright'] puts "#{@cmd} version #{Rex::VERSION}" puts "#{Rex::Copyright} <#{Rex::Mailto}>" exit 0 end end def usage( msg=nil ) f = $stderr f.puts "#{@cmd}: #{msg}" if msg f.print <<-EOT Usage: #{@cmd} [options] Options: EOT REX_OPTIONS.each do |line| next if line.strip.empty? if /\A\s*\z/ === line f.puts next end disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) if disp == 'o' sopt = nil if sopt == '-' lopt = nil if lopt == '-' opt = [sopt, lopt].compact.join(',') takearg = nil if takearg == '-' opt = [opt, takearg].compact.join(' ') f.printf "%-27s %s\n", opt, doc end end exit @status end end RexRunner.new.run rexical-1.0.5/test/test_generator.rb000066400000000000000000000117141150001465100174720ustar00rootroot00000000000000require 'test/unit' require 'tempfile' require 'rexical' require 'stringio' class TestGenerator < Test::Unit::TestCase def test_header_is_written_after_module rex = Rexical::Generator.new( "--independent" => true ) rex.grammar_file = File.join File.dirname(__FILE__), 'assets', 'test.rex' rex.read_grammar rex.parse output = StringIO.new rex.write_scanner output comments = [] output.string.split(/[\n]/).each do |line| comments << line.chomp if line =~ /^#/ end assert_match 'DO NOT MODIFY', comments.join assert_equal '#--', comments.first assert_equal '#++', comments.last end def test_read_non_existent_file rex = Rexical::Generator.new(nil) rex.grammar_file = 'non_existent_file' assert_raises Errno::ENOENT do rex.read_grammar end end def test_scanner_nests_classes source = parse_lexer %q{ module Foo class Baz::Calculator < Bar rule \d+ { [:NUMBER, text.to_i] } \s+ { [:S, text] } end end } assert_match 'Baz::Calculator < Bar', source end def test_scanner_inherits source = 
parse_lexer %q{ class Calculator < Bar rule \d+ { [:NUMBER, text.to_i] } \s+ { [:S, text] } end } assert_match 'Calculator < Bar', source end def test_scanner_inherits_many_levels source = parse_lexer %q{ class Calculator < Foo::Bar rule \d+ { [:NUMBER, text.to_i] } \s+ { [:S, text] } end } assert_match 'Calculator < Foo::Bar', source end def test_simple_scanner m = build_lexer %q{ class Calculator rule \d+ { [:NUMBER, text.to_i] } \s+ { [:S, text] } end } calc = m::Calculator.new calc.scan_setup('1 2 10') assert_tokens [[:NUMBER, 1], [:S, ' '], [:NUMBER, 2], [:S, ' '], [:NUMBER, 10]], calc end def test_simple_scanner_with_empty_action m = build_lexer %q{ class Calculator rule \d+ { [:NUMBER, text.to_i] } \s+ # skips whitespaces end } calc = m::Calculator.new calc.scan_setup('1 2 10') assert_tokens [[:NUMBER, 1], [:NUMBER, 2], [:NUMBER, 10]], calc end def test_parses_macros_with_escapes source = parse_lexer %q{ class Foo macro w [\ \t]+ rule {w} { [:SPACE, text] } end } assert source.index('@ss.scan(/[ \t]+/))') end def test_simple_scanner_with_macros m = build_lexer %q{ class Calculator macro digit \d+ rule {digit} { [:NUMBER, text.to_i] } \s+ { [:S, text] } end } calc = m::Calculator.new calc.scan_setup('1 2 10') assert_tokens [[:NUMBER, 1], [:S, ' '], [:NUMBER, 2], [:S, ' '], [:NUMBER, 10]], calc end def test_nested_macros source = parse_lexer %q{ class Calculator macro nonascii [^\0-\177] string "{nonascii}*" rule {string} { [:STRING, text] } end } assert_match '"[^\0-\177]*"', source end def test_more_nested_macros source = parse_lexer %q{ class Calculator macro nonascii [^\0-\177] sing {nonascii}* string "{sing}" rule {string} { [:STRING, text] } end } assert_match '"[^\0-\177]*"', source end def test_changing_state_during_lexing lexer = build_lexer %q{ class Calculator rule a { self.state = :B ; [:A, text] } :B b { self.state = nil ; [:B, text] } end } calc1 = lexer::Calculator.new calc2 = lexer::Calculator.new calc1.scan_setup('aaaaa') 
calc2.scan_setup('ababa') # Doesn't lex all 'a's assert_raise(lexer::Calculator::ScanError) { tokens(calc1) } # Does lex alternating 'a's and 'b's calc2.scan_setup('ababa') assert_tokens [[:A, 'a'], [:B, 'b'], [:A, 'a'], [:B, 'b'], [:A, 'a']], calc2 end def test_changing_state_is_possible_between_next_token_calls lexer = build_lexer %q{ class Calculator rule a { [:A, text] } :B b { [:B, text] } end } calc = lexer::Calculator.new calc.scan_setup('ababa') assert_equal [:A, 'a'], calc.next_token calc.state = :B assert_equal [:B, 'b'], calc.next_token calc.state = nil assert_equal [:A, 'a'], calc.next_token calc.state = :B assert_equal [:B, 'b'], calc.next_token calc.state = nil assert_equal [:A, 'a'], calc.next_token end def parse_lexer(str) rex = Rexical::Generator.new("--independent" => true) out = StringIO.new rex.grammar_lines = StringScanner.new(str) rex.parse rex.write_scanner(out) out.string end def build_lexer(str) mod = Module.new mod.module_eval(parse_lexer(str)) mod end def tokens(scanner) tokens = [] while token = scanner.next_token tokens << token end tokens end def assert_tokens(expected, scanner) assert_equal expected, tokens(scanner) end end