Regexp-Optimizer-0.23/

Regexp-Optimizer-0.23/Changes

#
# $Id: Changes,v 0.23 2013/02/26 05:47:41 dankogai Exp dankogai $
#
$Revision: 0.23 $ $Date: 2013/02/26 05:47:41 $
! lib/Regexp/Optimizer.pm
  + Support for (?|pattern)
! t/02-optimizer.t
  + more testing

0.22 2013/02/26 04:51:15
! t/02-optimizer.t
  Since qr// is evaluated at compile time, SKIP: is not enough to make
  older perls (which do not support named captures) skip these tests.

0.21 2013/02/26 04:51:15
! lib/Regexp/Optimizer.pm t/02-optimizer.t
  + Cleaner code
  + More tests
  + More documentation

0.20 2013/02/23 13:43:59
! *
  Completely rewritten.
  + uses Regexp::Assemble to optimize alternation
  + uses ??{CODE} to parse nested parens
  Thanks to these changes, this module is now much simpler.

0.16 2013/02/20 17:54:59
! lib/Regexp/Optimizer.pm lib/Regexp/List.pm
  Marked obsolete.  Use Regexp::Assemble instead.

0.15 2004/12/05 16:07:34
! lib/Regexp/Optimizer.pm lib/Regexp/List.pm
  Pod fixed according to RT # 8733
  http://rt.cpan.org/NoAuth/Bug.html?id=8733

0.14 2004/11/05 12:44:48
! t/02-list.t
  Addressed test failure that was raised by Perl 5.8.5
  http://rt.cpan.org/NoAuth/Bug.html?id=8165

0.13 2004/05/08 05:55:35
! lib/Regexp/Optimizer.pm
  Document bug, reported by Frederick in the mail below, fixed.
! Makefile.PL
! Regexp/ -> lib/Regexp/
  Module hierarchy realigned so cygwin is happy.
  Reported by Frederick Weiland
  <30F579C44E1598429167C5E20E307BA19D807D@atlmsg01.raremedium.net>

0.12 2004/05/04 17:12:14
! Regexp/Optimizer.pm
  Perl 5.8.4 and later corrected a bug so it is fatal to go
    my $x = qr{ ... (??{ $x }) ... };
  under "use strict".  Unfortunately Regexp/Optimizer.pm had two
  occurrences thereof.  Now fixed.

0.11 2004/05/03 15:09:14
!
Regexp/List.pm
  Deep recursion addressed and corrected by yoz@yoz.com
  https://rt.cpan.org/Ticket/Display.html?id=4937

0.10 2003/06/02 20:11:07
! *
  Version jumps to 0.10 due to radical changes, especially in
  $Regexp::Optimize.
! Regexp/Optimizer.pm t/02-list.t t/03-utf8.t
  Aagh! Why didn't I come up with such a simple idea !?
  ->optimize is completely rewritten.  t/*.t is streamlined to reflect
  changes.  Now ->optimize() *really* optimizes even for
  qr/(?:1|12)|123/.  Used to return qr/1(?:2?|23)/ but now it returns
  qr/1(?:23?)?/o
! Regexp/Optimizer.pm Regexp/List.pm
  POD enhanced.

0.02 2003/06/01 00:11:26
! Regexp/Optimizer.pm Regexp/List.pm
  * Lots of bug fixes regarding $o->optimize() for nested parens
! t/02-list.t
  Test data for #21 corrected
! t/03-utf8.t
  buggy SKIP: sections fixed.

0.01 2003/05/31 10:44:41
+ *
  0th release

Regexp-Optimizer-0.23/lib/

Regexp-Optimizer-0.23/Makefile.PL

#
# $Id: Makefile.PL,v 0.20 2013/02/23 13:43:59 dankogai Exp $
#
use 5.008001;
use strict;
use warnings FATAL => 'all';
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME               => 'Regexp::Optimizer',
    AUTHOR             => q{Dan Kogai },
    VERSION_FROM       => 'lib/Regexp/Optimizer.pm',
    ABSTRACT_FROM      => 'lib/Regexp/Optimizer.pm',
    LICENSE            => 'Artistic_2_0',
    PL_FILES           => {},
    MIN_PERL_VERSION   => 5.008001,
    CONFIGURE_REQUIRES => { 'ExtUtils::MakeMaker' => 0, },
    BUILD_REQUIRES     => { 'Test::More' => 0, },
    PREREQ_PM          => { 'Regexp::Assemble' => 0, },
    dist  => { COMPRESS => 'gzip -9f', SUFFIX => 'gz', },
    clean => { FILES => 'Regexp-Optimizer-*' },
);

Regexp-Optimizer-0.23/MANIFEST

Changes
MANIFEST			This list of files
Makefile.PL
README
lib/Regexp/List.pm
lib/Regexp/Optimizer.pm
t/00-load.t
t/01-list.t
t/02-optimizer.t
t/manifest.t
t/pod-coverage.t
t/pod.t
META.yml                                 Module YAML
meta-data (added by MakeMaker)
META.json                                Module JSON meta-data (added by MakeMaker)

Regexp-Optimizer-0.23/META.json

{
   "abstract" : "optimizes regular expressions",
   "author" : [
      "Dan Kogai "
   ],
   "dynamic_config" : 1,
   "generated_by" : "ExtUtils::MakeMaker version 6.64, CPAN::Meta::Converter version 2.120921",
   "license" : [
      "unknown"
   ],
   "meta-spec" : {
      "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec",
      "version" : "2"
   },
   "name" : "Regexp-Optimizer",
   "no_index" : {
      "directory" : [
         "t",
         "inc"
      ]
   },
   "prereqs" : {
      "build" : {
         "requires" : {
            "Test::More" : "0"
         }
      },
      "configure" : {
         "requires" : {
            "ExtUtils::MakeMaker" : "0"
         }
      },
      "runtime" : {
         "requires" : {
            "Regexp::Assemble" : "0",
            "perl" : "5.008001"
         }
      }
   },
   "release_status" : "stable",
   "version" : "0.23"
}

Regexp-Optimizer-0.23/META.yml

---
abstract: 'optimizes regular expressions'
author:
  - 'Dan Kogai '
build_requires:
  Test::More: 0
configure_requires:
  ExtUtils::MakeMaker: 0
dynamic_config: 1
generated_by: 'ExtUtils::MakeMaker version 6.64, CPAN::Meta::Converter version 2.120921'
license: unknown
meta-spec:
  url: http://module-build.sourceforge.net/META-spec-v1.4.html
  version: 1.4
name: Regexp-Optimizer
no_index:
  directory:
    - t
    - inc
requires:
  Regexp::Assemble: 0
  perl: 5.008001
version: 0.23

Regexp-Optimizer-0.23/README

Regexp-Optimizer

The README is used to introduce the module and provide instructions on
how to install the module, any machine dependencies it may have (for
example C compilers and installed libraries) and any other information
that should be provided before the module is installed.
A README file is required for CPAN modules since CPAN extracts the README
file from a module distribution so that people browsing the archive
can use it to get an idea of the module's uses. It is usually a good idea
to provide version information here so that people can decide whether
fixes for the module are worth downloading.


INSTALLATION

To install this module, run the following commands:

	perl Makefile.PL
	make
	make test
	make install

SUPPORT AND DOCUMENTATION

After installing, you can find documentation for this module with the
perldoc command.

    perldoc Regexp::Optimizer

You can also look for information at:

    RT, CPAN's request tracker (report bugs here)
        http://rt.cpan.org/NoAuth/Bugs.html?Dist=Regexp-Optimizer

    AnnoCPAN, Annotated CPAN documentation
        http://annocpan.org/dist/Regexp-Optimizer

    CPAN Ratings
        http://cpanratings.perl.org/d/Regexp-Optimizer

    Search CPAN
        http://search.cpan.org/dist/Regexp-Optimizer/


LICENSE AND COPYRIGHT

Copyright (C) 2013 Dan Kogai

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

L

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Regexp-Optimizer-0.23/t/

Regexp-Optimizer-0.23/t/00-load.t

#!perl -T
use 5.006;
use strict;
use warnings FATAL => 'all';
use Test::More;

plan tests => 1;

BEGIN {
    use_ok( 'Regexp::Optimizer' ) || print "Bail out!\n";
}

diag( "Testing Regexp::Optimizer $Regexp::Optimizer::VERSION, Perl $], $^X" );

Regexp-Optimizer-0.23/t/01-list.t

#!perl -T
use 5.006;
use strict;
use warnings FATAL => 'all';
use Regexp::List;
use Test::More;
plan tests => 1;

my $rl = Regexp::List->new();
my $ra = Regexp::Assemble->new();
my @list = ( 'ab+c', 'ab+-', 'a\w\d+', 'a\d+' );
$ra->add(@list);
is $rl->list2re(@list), $ra->re, 'Regexp::Assemble';

Regexp-Optimizer-0.23/t/02-optimizer.t

#!perl -T
use 5.008001;
use strict;
use warnings FATAL => 'all';
use Regexp::Optimizer;
use Test::More;
plan tests => 10;

my $ro = Regexp::Optimizer->new();
my $ra = Regexp::Assemble->new->add(qw/foobar fooxar foozap/)->re;
is $ro->as_string(qr/foobar|fooxar|foozap/), $ra, $ra;

my $re_verbose = qr{
    foobar | # comment
    fooxar   #
    |        #
    foozap
}msx;
is $ro->as_string($re_verbose), qr/foo(?:[bx]ar|zap)/msx, "qr//msx";

# Not idempotent
# is $ro->as_string($ra), $ra, $ra;

my $re_noneed = qr/no(alteration(in(the(expression))))/;
is $ro->optimize($re_noneed), $re_noneed, 'Already Optimized';

my $re_escaped = qr/(\(|a|b|c|\))/;
is $ro->as_string($re_escaped), qr/([()abc])/, 'Escaped';

my $re_nested = qr/f(?:oo(?:l|lish|lishness)?)/;
is $ro->as_string($re_nested), qr/f(?:oo(?:l(?:ish(?:ness)?)?)?)/, 'Nested';

SKIP: {
    skip "Perl v5.10 or better required", 5 unless $] >= 5.010;
    eval q{
        my $re_named = qr/(?<abc>a|b|c)/;
        is $ro->as_string($re_named), qr/(?<abc>[abc])/, "Named: $re_named";
        $re_named =
            qr/(?'abc'a|b|c)/;
        is $ro->as_string($re_named), qr/(?'abc'[abc])/, "Named: $re_named";
        my $re_brset = qr/(?|foo|fool)/;
        is $ro->as_string($re_brset), qr/(?|fool?)/, "Branch Reset";
        for my $str ( qw{ (??{0|1}) (?(?=bar|foo)foo|bar) } ) {
            use re 'eval';
            my $re = qr{$str};
            is $ro->as_string($re), $re, "Code: $re";
        }
    };
}

Regexp-Optimizer-0.23/t/manifest.t

#!perl -T
use 5.006;
use strict;
use warnings FATAL => 'all';
use Test::More;

unless ( $ENV{RELEASE_TESTING} ) {
    plan( skip_all => "Author tests not required for installation" );
}

my $min_tcm = 0.9;
eval "use Test::CheckManifest $min_tcm";
plan skip_all => "Test::CheckManifest $min_tcm required" if $@;

ok_manifest();

Regexp-Optimizer-0.23/t/pod-coverage.t

#!perl -T
use 5.006;
use strict;
use warnings FATAL => 'all';
use Test::More;

# Ensure a recent version of Test::Pod::Coverage
my $min_tpc = 1.08;
eval "use Test::Pod::Coverage $min_tpc";
plan skip_all => "Test::Pod::Coverage $min_tpc required for testing POD coverage"
    if $@;

# Test::Pod::Coverage doesn't require a minimum Pod::Coverage version,
# but older versions don't recognize some common documentation styles
my $min_pc = 0.18;
eval "use Pod::Coverage $min_pc";
plan skip_all => "Pod::Coverage $min_pc required for testing POD coverage"
    if $@;

all_pod_coverage_ok();

Regexp-Optimizer-0.23/t/pod.t

#!perl -T
use 5.006;
use strict;
use warnings FATAL => 'all';
use Test::More;

# Ensure a recent version of Test::Pod
my $min_tp = 1.22;
eval "use Test::Pod $min_tp";
plan skip_all => "Test::Pod $min_tp required for testing POD" if $@;

all_pod_files_ok();

Regexp-Optimizer-0.23/lib/Regexp/

Regexp-Optimizer-0.23/lib/Regexp/List.pm

package Regexp::List;
use 5.008001;
use strict;
use warnings FATAL => 'all';
use Regexp::Assemble;
our $VERSION = sprintf "%d.%02d", q$Revision: 0.20 $ =~ /(\d+)/g;

# Compatibility stub: the object carries no state.
sub new { bless \my $dummy, shift; };

# Delegate the whole job to Regexp::Assemble.
sub list2re {
    my $self = shift;
    Regexp::Assemble->new->add(@_)->re;
};

1; # End of Regexp::List
__END__

=head1 NAME

Regexp::List - Assemble multiple Regular Expressions into a single RE

=head1 VERSION

$Id: List.pm,v 0.20 2013/02/23 13:43:59 dankogai Exp $

=head1 DEPRECATED

Use L<Regexp::Assemble> instead.

=head1 SYNOPSIS

  use Regexp::List;
  my $rl = Regexp::List->new();
  my @list = ( 'ab+c', 'ab+-', 'a\w\d+', 'a\d+' );
  print $rl->list2re(@list); # Regexp::Assemble->new->add(@list)->re;

=head1 DESCRIPTION

This module exists only for compatibility with version 0.16 and below.

=over 2

=item new

Just a stub.

=item list2re

Simply does:

  Regexp::Assemble->new->add(@list)->re;

=back

=head1 SEE ALSO

L, L

=head1 AUTHOR

Dan Kogai, C<< >>

=head1 BUGS

Please report any bugs or feature requests to C, or through the web interface at L. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Regexp::Optimizer

You can also look for information at:

=over 4

=item * RT: CPAN's request tracker (report bugs here)

L

=item * AnnoCPAN: Annotated CPAN documentation

L

=item * CPAN Ratings

L

=item * Search CPAN

L

=back

=head1 LICENSE AND COPYRIGHT

Copyright 2013 Dan Kogai.

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

L

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license.
Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=cut

Regexp-Optimizer-0.23/lib/Regexp/Optimizer.pm

package Regexp::Optimizer;
use 5.008001;
use strict;
use warnings FATAL => 'all';
use Regexp::Assemble;
our $VERSION = sprintf "%d.%02d", q$Revision: 0.23 $ =~ /(\d+)/g;

my $re_nested;
$re_nested = qr{
    \(                      # open paren
    ((?:                    # start capture
        (?>[^()]+)       |  # Non-parens w/o backtracking or ...
        (??{ $re_nested })  # Group with matching parens
    )*)                     # end capture
    \)                      # close paren
}msx;

# An unescaped '|' means there is alternation left to optimize.
my $re_optimize = qr{(?<=[^\\])\|}ms;

sub new {
    my $class = shift;
    bless {@_}, $class;
}

sub _assemble {
    my $str = shift;
    return $str if $str !~ $re_optimize;
    # No parens at all: hand the alternatives straight to Regexp::Assemble.
    if ( $str !~ m/[(]/ms ) {
        my $ra = Regexp::Assemble->new();
        $ra->add( split m{[|]}, $str );
        return $ra->as_string;
    }
    # Otherwise recurse into each balanced group.
    $str =~ s{$re_nested}{
        no warnings 'uninitialized';
        my $sub = $1;
        if ($sub =~ m/\A\?(?:[\?\{\(PR]|[\+\-]?[0-9])/ms) {
            "($sub)"; # (?{CODE}) and like ruled out
        }else{
            my $mod = ($sub =~ s/\A\?//) ? '?' : '';
            if ($mod) {
                $sub =~ s{\A(
                         [\w\^\-]*: | # modifier
                         [<]?[=!]   | # assertions
                         [<]\w+[>]  | # named capture
                         [']\w+[']  | # ditto
                         [|]          # branch reset
                     )
                }{}msx;
                $mod .= $1;
            }
            '(' . $mod . _assemble($sub) . ')'
        }
    }msxge;
    $str;
}

sub as_string {
    my ( $self, $str ) = @_;
    return $str if $str !~ $re_optimize;
    my ($mod) = ($str =~ m/\A\(\?(.*?):/);
    # $mod is undef for a plain string; guard it because warnings are fatal.
    if ( defined $mod && $mod =~ /x/ ) {
        $str =~ s{^\s+}{}mg;
        $str =~ s{(?<=[^\\])\s*?#.*?$}{}mg;
        $str =~ s{\s+[|]\s+}{|}mg;
        $str =~ s{(?:\r\n?|\n)}{}msg;
        $str =~ s{[ ]+}{ }msgx;
        # warn $str;
    }
    # escape all occurrences of '\(' and '\)'
    $str =~ s/\\([\(\)])/sprintf "\\x%02x" , ord $1/ge;
    _assemble($str);
}

sub optimize {
    my $self = shift;
    my $re = $self->as_string(shift);
    qr{$re};
}

1; # End of Regexp::Optimizer
__END__

=head1 NAME

Regexp::Optimizer - optimizes regular expressions

=head1 VERSION

$Id: Optimizer.pm,v 0.23 2013/02/26 05:47:41 dankogai Exp dankogai $

=head1 SYNOPSIS

  use Regexp::Optimizer;
  my $re = Regexp::Optimizer->new->optimize(qr/foobar|fooxar|foozap/);
  # $re is now qr/foo(?:[bx]ar|zap)/

=head1 EXPORT

none.

=head1 SUBROUTINES/METHODS

=head2 new

Makes a new optimizer instance.

  my $ro = Regexp::Optimizer->new;

=head2 optimize

Does the optimization.

  my $re = qr/foobar|fooxar|foozap/;
  $re = $ro->optimize($re);

If the regexp is already optimized (that is, it contains no alternation), this method is practically an identity function that simply returns its argument.
If not, it disassembles the regexp, feeds it to L<Regexp::Assemble>, and reassembles the result.

=head2 as_string

Same as C<optimize> but returns a string instead of a regexp object.

=head1 CAVEAT

=head2 ??{CODE} used

This module depends on the C<(??{CODE})> regexp construct, which is still considered experimental as of Perl 5.16.

=head2 not idempotent

If you feed it a regexp that is already optimized, the resulting regexp is not necessarily the same. Usually you get a redundant C<(?:)>:

  my $re = qr/foobar|fooxar|foozap/;
  $re = $ro->optimize($re); # qr/foo(?:[bx]ar|zap)/
  $re = $ro->optimize($re); # qr/foo(?:(?:[bx]ar|zap))/

=head1 SEE ALSO

L, L

=head1 AUTHOR

Dan Kogai, C<< >>

=head1 BUGS

Please report any bugs or feature requests to C, or through the web interface at L. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Regexp::Optimizer

You can also look for information at:

=over 4

=item * RT: CPAN's request tracker (report bugs here)

L

=item * AnnoCPAN: Annotated CPAN documentation

L

=item * CPAN Ratings

L

=item * Search CPAN

L

=back

=head1 ACKNOWLEDGEMENTS

=head1 LICENSE AND COPYRIGHT

Copyright 2013 Dan Kogai.

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

L

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=cut
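
The recursion that the CAVEAT section warns about may be easier to follow in isolation. The sketch below is not part of the distribution; it only illustrates the same declare-then-assign pattern that $re_nested uses, applied to a smaller task: pulling the outermost balanced parenthesised groups out of a string. It needs core Perl only. The names $balanced and outer_groups are hypothetical and were chosen for this illustration.

```perl
use strict;
use warnings;

# Declare first, assign second: under "use strict" the variable must
# already exist when the (??{ ... }) block refers to it.
my $balanced;
$balanced = qr{
    \(                        # open paren
    (?:
        (?> [^()]+ )          # run of non-parens, no backtracking
      |
        (??{ $balanced })     # or a nested, balanced group
    )*
    \)                        # close paren
}x;

# Hypothetical helper: return every outermost balanced group in $str.
sub outer_groups {
    my ($str) = @_;
    my @groups;
    push @groups, $1 while $str =~ /($balanced)/g;
    return @groups;
}

print "$_\n" for outer_groups('f(?:oo(?:l|lish))bar(baz)');
# prints:
#   (?:oo(?:l|lish))
#   (baz)
```

Only the outermost groups are returned; the nested (?:l|lish) is consumed by the recursive call rather than reported separately, which is why the module has to call _assemble again on each captured group to reach the inner alternations.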