Regexp-Common-2016020301/000755 000765 000024 00000000000 12654450574 015341 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/Changes000644 000765 000024 00000026512 12654447134 016640 0ustar00abigailstaff000000 000000 Version 2016020301 Wed Feb 3 20:00:13 CET 2016 + Fixed POD issue (Hernan Lopes) Version 2016010801 Fri Jan 8 12:52:36 CET 2016 + Use 'done_testing ()' instead of 'done_testing', so the test file compiles even if the user uses an old Test::More. Version 2016010701 Thu Jan 7 19:31:34 CET 2016 + Removed the 29143 tests from t/number/integer.t, and replaced them with 34689 spread over 7 new files. The reason is, there are sporadic failures reported by CPAN testers in the t/number/integer.t file, but this file uses the random number generator to generate tests. Which makes issues very hard to debug, as failures cannot be reproduced. The new tests will be deterministic. Version 2015121601 Wed Dec 16 01:12:18 CET 2015 + Force Darwin 10.0 on Perl 5.10.0 to run square tests in 32 bit mode. Version 2013031302 + Doc typo fixes (Jonathan You /RT 67549) and (Gregor Herrmann /RT 71131). Version 2013031301 Wed Mar 13 12:03:41 CET 2013 + Pattern for IPv6 addresses. Requested by Guy Edwards and many others. RT 50693. Version 2013031201 Tue Mar 12 15:44:48 CET 2013 + Allow host/domain names to start with a digit, using $RE{net}{domain}{-rfc1101}. Requested by Guy Edwards and many others. RT 23626. Version 2013031101 Mon Mar 11 21:02:45 CET 2013 + For integers and decimal numbers (reals), allow the user to specify the pattern of the signs (leading sign, and for reals, the sign of the exponent). This gives the user the option to ask for a pattern that matches unsigned numbers (by specifying the empty string as the pattern). Requested by "Wilson, Jonathan" . Version 2013030901 Sat Mar 9 14:51:42 CET 2013 + Use (?-1) instead of (??{ }) for the recursive balanced pattern. This makes the pattern unavailable for pre-5.010 perls. Version 2011121001 Sat Dec 10 21:32:49 CET 2011 + Fixed a few cases where $[ was used instead of $]. (RT 73033 by Father Chrysostomos ) + Typo fix in docs (P Ramakers ) + Add an explicite 1 to end the main module; otherwise it may fail under Mason (Luciano Rocha ) Version 2011041801 Mon Apr 18 11:18:19 CEST 2011 + Created tests for Pascal comments. + https schemes are now included in $RE {URI} + Document non-anchoring (RT 55549) + Fix POD typos (RT 67549 by Salvatore Bonaccorso ) Version 2011041701 Sun Apr 17 14:22:58 CEST 2011 + Fix documentation (matches should be match). Reported by cogent@cpan.org in RT ticket 2833. + Added patterns for JavaDoc comments. (RT 13174 by chris+rt@chrisdolan.net) Version 2011041602 Sat Apr 16 17:06:28 CEST 2011 + Fix typo in repository (avar@cpan.org) Version 2011041601 Sat Apr 16 16:07:29 CEST 2011 + Fix bugs 61712 & 6940: $RE {num} {decimal} and $RE {num} {real} no longer match a single dot. Version 2010010201 Sat Jan 2 17:45:53 CET 2010 + Playing the CPANTS game + Adjustments for 5.005. + Add $VERSION to all submodules (RT #53250) Version 2009123001 Wed Dec 30 03:58:27 2009 +0100 + Added Changes file, removed them from README. + Remove CVS traces. + Modernize Makefile.PL + POD tests + Work around a Darwin 9.0 on 5.8.8 issue. (RT #43786) + Fix POD issues. (RT #34886) (RT #48974) + Fail faster on some URI subpatterns (RT #52309) + Added many people to the Acknowledgements section. Version 2.122 + Added various licences Version 2.121 + Removed spurious '=head1' from README. (Smylers) + Comments for 'm4', 'Modula-2', 'Modula-3', 'Ubercode', and 'ZZT-OOP'. + Fix for French zip codes; added Swiss zip codes (Rafael) Version 2.120 Wed Mar 16 01:39:57 CET 2005 + '-base', and '-places' options for $RE {num} {int}. '-group' can now take arguments for the form 'N,M'. + New file: t/number/integer.t + Comments for BML, INTERCAL (JP), and CQL. + Regexp::Common::SEN nits. + Many more tests. + ca. 223k tests in 56 files. Version 2.119 Sat Jan 1 17:57:01 CET 2005 + Changed -keep settings of $RE {zip} {US}. + Replaced occurrences of '\d' with '[0-9]' in number.pm and net.pm, because with Unicode more characters than just 0 - 9 will match '\d'. + Moved part of the tests of t/test_comment.t into different files in t/comment. + t/Common.pm now also tests OO and sub versions. + Fixed regex for J comments. + $VERSION nit in Regexp::Common::URI::RFC2384 + POD nit in Regexp::Common::_support (Mike Castle) + POD nit in Regexp::Common::comment. + 176048 tests in 53 files. Version 2.118 Wed Dec 15 00:16:09 CET 2004 + Introduced Regexp::Common::_support. + Disabled the '-prefix' option for Danish zip codes if the Perl version is older than 5.005_03, as they triggered bugs. 5.005_03 and more recent versions of Perl don't have this problem. + Fixed the generic OO routines. (They never worked correctly). + 134191 tests in 49 files. Version 2.117 Thu Jul 1 12:11:47 CEST 2004 + Fixed problems with squares and 32bit integer Perls. + Pod nits in Regexp::Common (Jim Cromie) + $VERSION fix in Regexp::Common::URI::RFCC2384 (Mike Arms) + Discuss unwanted matching in Regexp::Common::net (Charles Thomas) + 132238 tests in 48 files. Version 2.116 Wed Jun 30 11:37:45 CEST 2004 + Restricted recognition of squares to numbers less than 9000000000000000, to avoid problems with round-off errors. + Fixed an off-by-one error in t/zip/spain.t which caused some test to fail when they shouldn't. + 132235 tests in 48 files. Version 2.115 Wed Jun 9 23:59:13 2004 + Patterns for comments of ABC, Caml, CLU, COBOL, ECMAScript, Icon, J, JavaScript, Lisp, M, MUMPS. + Patterns for postal codes of Norway, Italy and Spain. + Patterns for US SSN. + New way of doing tests using t::Common, giving more control to test program. + Random generators in t::Common. + Cut down on tests in test_lingua_palindrome.t. + Fixed bug in t/zip/australia.t could cause an infinite loop. + 132225 tests in 48 files. Version 2.114 Sun May 25 21:34:56 2003 + Fix to t/zip/greenland.t to avoid generating valid zipcodes when testing for failures. + Fixes to t/URI/gopher.t and t/URI/wais.t to avoid warnings. + Australian postal codes now accept '0909'. (Ron Savage) + Added comments for 'C--', 'C#', 'Cg', 'Nickle', 'PL/SQL', 'QML' and SLIDE. + Fixed the assignment of the version number to $VERSION in 6 classes. + 158287 tests in 44 files. + 134 patterns in 11 classes and 12 subclasses. Version 2.113 Wed Apr 2 22:58:46 2003 + INCOMPATIBLE CHANGE! Regexp::Common used to set $; to '='. This no longer happens, because setting $; breaks Filter::Simple. (Report by Tim Maher). This means that regexps of the form $RE{foo}{"-flag=value"} no longer work! They need to be written as $RE{foo}{"-flag$;value"} or as $RE{foo}{-flag => "value"}. When defining patterns using the pattern function, a = still needs to be used to separate the flag from its default value. This has not been changed. We are very sorry for the inconvenience. + 157762 tests in 44 files. + 127 patterns in 11 classes and 12 subclasses. Version 2.112 Wed Mar 26 00:25:04 2003 + prospero and pop URIs + Fixed documentation of Regexp::Common::lingua to document the lingua stuff instead of Regexp::Common::zip (Murat). + 157761 tests in 44 files. + 127 patterns in 11 classes and 12 subclasses. Version 2.111 Wed Mar 12 23:30:03 2003 + Introduced the -i switch. Important if you are using the functional interface and want to be able to match case insensitive. (Request from Tim Maher). + Introduced the -nospace switch for $RE{net}{domain} (Request from Juerd). + Decimal numbers. + WAIS URIs. + More generic setup to define comments for various languages. + Expanded and redid the documentation for comment.pm. + Comments for Advisor, Advsys, Alan, Algol 60, Algol 68, B, BASIC (mvEnterprise), Forth, Fortran (both fixed and free form), fvwm2, mutt, Oberon, 6 versions of Pascal, PEARL (one of the at least four...), PL/B, PL/I, slrn, Squeak. + 151114 tests in 42 files. + 125 patterns in 11 classes and 10 subclasses. Version 2.110 Fri Feb 21 15:58:14 2003 + Fixed t/test_squares.t. '0x7FFFFFFFFFFFFFFF' cannot be used on non-64bit platforms - eval to the rescue. + Added $VERSION to the .pm files who were missing them. + gopher, news, nntp URIs. + 145013 tests in 39 files. + 100 patterns in 11 classes and 9 subclasses. Version 2.109 Mon Feb 10 22:41:45 2003 + Split out URI.pm into a myriad of files. There is a separate file for each URI scheme, and a separate file for each RFC that's used. URI.pm requires all the URI scheme files, and constructs the combining URI regexp. + file URIs. + Cut down on the number of tests run for certain classes. + 123810 tests in 36 files. + 95 patterns in 11 classes and 7 subclasses. Version 2.108 Sun Feb 9 22:58:56 2003 + Postal codes for Belgium, Denmark and Greenland. + Renamed the postal code patterns. + 194125 tests in 35 files. + 94 patterns in 11 classes. Version 2.107 Fri Feb 7 23:20:35 2003 + telnet URIs; Lua and FPL comments. + common code factored out - created t/Common.pm + 175984 tests in 32 files. + 91 patterns in 11 classes. Version 2.106 Sun Feb 2 18:42:08 2003 + Makefile.PL fixes for Windows. Version 2.105 Sun Feb 2 04:15:54 2003 (In remembrance of the Columbia crew) + Australian postal codes. + Reorganized t/ directory by adding subdirectories. + 88 patterns in 11 classes. + 163355 tests in 31 files. Version 2.104 Fri Jan 24 16:44:19 2003 + Forgot to add t/test_zip_german.t and t/test_zip_french.t to the MANIFEST file. Version 2.103 Thu Jan 23 03:21:17 2003 + Added German and French postal codes. + Fixed some bugs concerning HTTP URIs. + Complete remake of t/test_uri_http, with 15k+ tests. Extended testing exposed the now fixed bugs. + 87 patterns in 11 classes. + 156778 tests in 30 files. Version 1.30 Fri Jan 17 14:20:02 2003 + Fixed a bug concerning HTTP and FTP URIs. (reported by Hermann-Marcus Behrens) Version 1.29 Thu Jan 16 12:07:02 2003 + New since last release: Squares, Roman numbers, TV URIs, Palindromes, Dutch and US postal codes. + 131710 tests in 28 files. + 11 pattern classes. Version 1.20 Tue Aug 27 19:06:13 CEST 2002 + Balanced patterns can now take multiple sets of arbitrary strings as delimiters. + Fax URIs. + More comment patterns. Version 0.09 Tue Aug 6 16:44:57 CEST 2002 + Fixed $RE{URI}{tel}, local phone numbers can have future extensions as well. Version 0.08 Tue Aug 6 15:50:31 CEST 2002 + Added tel URI regexes. Version 0.07 Mon Aug 5 14:31:17 CEST 2002 + Fixed 'Regex::' and 'Rexexp::' typos. + Split t/test_uris.t into t/test_ftp_uri.t, t/test_http_uri.t and t/test_uris.t. Version 0.06 Mon Aug 5 00:56:19 CEST 2002 + URI regexes. Currently only HTTP and FTP. More to come. Version 0.05 Thu Aug 1 12:01:04 CEST 2002 + Improved the 'subs' method of MAC addresses. Version 0.04 Thu Aug 1 01:18:37 CEST 2002 + Added the set of $RE{net}{MAC} regexes, by request of Iain Truskett . + Required minimum Perl version for regexes for Haskell and Dylan comment, as they can be recursive. + Petdance suggested regexes for LOGO comments. Version 0.03 Wed Jul 31 15:21:11 CEST 2002 + Made the entire setup more modular, giving each set of patterns its own .pm file. Loading all goes via Regexp::Common though. + use strict; everywhere - it also runs under -W (localized). + Added comment regexes for many more languages (26 currently). + Fixed some small bugs. Version 0.02 Tue Jul 23 23:18:15 2002 + Added $RE{comment}{HTML} Version 0.01 Thu May 18 14:45:14 2000 + original version; created by h2xs 1.18 Regexp-Common-2016020301/COPYRIGHT000644 000765 000024 00000000626 12116413566 016631 0ustar00abigailstaff000000 000000 This software is Copyright (c) 2001 - 2008, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. Regexp-Common-2016020301/COPYRIGHT.AL000644 000765 000024 00000011605 12116413566 017123 0ustar00abigailstaff000000 000000 The Artistic License Preamble The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications. Definitions: * "Package" refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification. * "Standard Version" refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder. * "Copyright Holder" is whoever is named in the copyright or copyrights for the package. * "You" is you, if you're thinking about copying or distributing this Package. * "Reasonable copying fee" is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.) * "Freely Available" means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it. 1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers. 2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version. 3. You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following: a) place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as ftp.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package. b) use the modified Package only within your corporation or organization. c) rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version. d) make other distribution arrangements with the Copyright Holder. 4. You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following: a) distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version. b) accompany the distribution with the machine-readable source of the Package with your modifications. c) accompany any non-standard executables with their corresponding Standard Version executables, giving the non-standard executables non-standard names, and clearly documenting the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version. d) make other distribution arrangements with the Copyright Holder. 5. You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. 6. The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whomever generated them, and may be sold commercially, and may be aggregated with this Package. 7. C or perl subroutines supplied by you and linked into this Package shall not be considered part of this Package. 8. The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission. 9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. The End Regexp-Common-2016020301/COPYRIGHT.AL2000644 000765 000024 00000021004 12116413566 017177 0ustar00abigailstaff000000 000000 Artistic License 2.0 Copyright (c) 2000-2006, The Perl Foundation. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble This license establishes the terms under which a given free software Package may be copied, modified, distributed, and/or redistributed. The intent is that the Copyright Holder maintains some artistic control over the development of that Package while still keeping the Package available as open source and free software. You are always permitted to make arrangements wholly outside of this license directly with the Copyright Holder of a given Package. If the terms of this license do not permit the full use that you propose to make of the Package, you should contact the Copyright Holder and seek a different licensing arrangement. Definitions "Copyright Holder" means the individual(s) or organization(s) named in the copyright notice for the entire Package. "Contributor" means any party that has contributed code or other material to the Package, in accordance with the Copyright Holder's procedures. "You" and "your" means any person who would like to copy, distribute, or modify the Package. "Package" means the collection of files distributed by the Copyright Holder, and derivatives of that collection and/or of those files. A given Package may consist of either the Standard Version, or a Modified Version. "Distribute" means providing a copy of the Package or making it accessible to anyone else, or in the case of a company or organization, to others outside of your company or organization. "Distributor Fee" means any fee that you charge for Distributing this Package or providing support for this Package to another party. It does not mean licensing fees. "Standard Version" refers to the Package if it has not been modified, or has been modified only in ways explicitly requested by the Copyright Holder. "Modified Version" means the Package, if it has been changed, and such changes were not explicitly requested by the Copyright Holder. "Original License" means this Artistic License as Distributed with the Standard Version of the Package, in its current version or as it may be modified by The Perl Foundation in the future. "Source" form means the source code, documentation source, and configuration files for the Package. "Compiled" form means the compiled bytecode, object code, binary, or any other form resulting from mechanical transformation or translation of the Source form. Permission for Use and Modification Without Distribution (1) You are permitted to use the Standard Version and create and use Modified Versions for any purpose without restriction, provided that you do not Distribute the Modified Version. Permissions for Redistribution of the Standard Version (2) You may Distribute verbatim copies of the Source form of the Standard Version of this Package in any medium without restriction, either gratis or for a Distributor Fee, provided that you duplicate all of the original copyright notices and associated disclaimers. At your discretion, such verbatim copies may or may not include a Compiled form of the Package. (3) You may apply any bug fixes, portability changes, and other modifications made available from the Copyright Holder. The resulting Package will still be considered the Standard Version, and as such will be subject to the Original License. Distribution of Modified Versions of the Package as Source (4) You may Distribute your Modified Version as Source (either gratis or for a Distributor Fee, and with or without a Compiled form of the Modified Version) provided that you clearly document how it differs from the Standard Version, including, but not limited to, documenting any non-standard features, executables, or modules, and provided that you do at least ONE of the following: (a) make the Modified Version available to the Copyright Holder of the Standard Version, under the Original License, so that the Copyright Holder may include your modifications in the Standard Version. (b) ensure that installation of your Modified Version does not prevent the user installing or running the Standard Version. In addition, the Modified Version must bear a name that is different from the name of the Standard Version. (c) allow anyone who receives a copy of the Modified Version to make the Source form of the Modified Version available to others under (i) the Original License or (ii) a license that permits the licensee to freely copy, modify and redistribute the Modified Version using the same licensing terms that apply to the copy that the licensee received, and requires that the Source form of the Modified Version, and of any works derived from it, be made freely available in that license fees are prohibited but Distributor Fees are allowed. Distribution of Compiled Forms of the Standard Version or Modified Versions without the Source (5) You may Distribute Compiled forms of the Standard Version without the Source, provided that you include complete instructions on how to get the Source of the Standard Version. Such instructions must be valid at the time of your distribution. If these instructions, at any time while you are carrying out such distribution, become invalid, you must provide new instructions on demand or cease further distribution. If you provide valid instructions or cease distribution within thirty days after you become aware that the instructions are invalid, then you do not forfeit any of your rights under this license. (6) You may Distribute a Modified Version in Compiled form without the Source, provided that you comply with Section 4 with respect to the Source of the Modified Version. Aggregating or Linking the Package (7) You may aggregate the Package (either the Standard Version or Modified Version) with other packages and Distribute the resulting aggregation provided that you do not charge a licensing fee for the Package. Distributor Fees are permitted, and licensing fees for other components in the aggregation are permitted. The terms of this license apply to the use and Distribution of the Standard or Modified Versions as included in the aggregation. (8) You are permitted to link Modified and Standard Versions with other works, to embed the Package in a larger work of your own, or to build stand-alone binary or bytecode versions of applications that include the Package, and Distribute the result without restriction, provided the result does not expose a direct interface to the Package. Items That are Not Considered Part of a Modified Version (9) Works (including, but not limited to, modules and scripts) that merely extend or make use of the Package, do not, by themselves, cause the Package to be a Modified Version. In addition, such works are not considered parts of the Package itself, and are not subject to the terms of this license. General Provisions (10) Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license. (11) If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license. (12) This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder. (13) This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed. (14) Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Regexp-Common-2016020301/COPYRIGHT.BSD000644 000765 000024 00000002740 12116413566 017237 0ustar00abigailstaff000000 000000 Copyright (c) 2001 - 2008, Damian Conway and Abigail All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * The names of its contributors may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Regexp-Common-2016020301/COPYRIGHT.MIT000644 000765 000024 00000002065 12116413566 017260 0ustar00abigailstaff000000 000000 Copyright (c) 2001 - 2008, Damian Conway and Abigail Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. Regexp-Common-2016020301/lib/000755 000765 000024 00000000000 12654450573 016106 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/Makefile.PL000644 000765 000024 00000003054 12116413566 017306 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use 5.00473; use strict; use ExtUtils::MakeMaker; my @tests = qw [t/*.t t/*/*.t]; my %args = ( NAME => 'Regexp::Common', VERSION_FROM => 'lib/Regexp/Common.pm', ABSTRACT_FROM => 'lib/Regexp/Common.pm', PREREQ_PM => { 'vars' => 0, 'strict' => 0, }, MIN_PERL_VERSION => 5.00473, AUTHOR => 'Abigail ', LICENSE => 'mit', META_MERGE => { test_requires => { 'strict' => 0, }, resources => { repository => 'git://github.com/Abigail/Regexp--Common.git', }, keywords => ['regular expression', 'pattern'], }, test => { TESTS => $^O eq 'MSWin32' ? "@{[map {glob} @tests]}" : "@tests" }, ); $args {META_MERGE} {build_requires} ||= { 'ExtUtils::MakeMaker' => 0, %{$args {META_MERGE} {test_requires}} }; $args {META_MERGE} {configure_requires} ||= $args {META_MERGE} {build_requires}; my %filter = ( MIN_PERL_VERSION => '6.48', META_MERGE => '6.46', AUTHOR => '6.07', ABSTRACT_FROM => '6.07', LICENSE => '6.07', ); delete $args {$_} for grep {defined $filter {$_} && $ExtUtils::MakeMaker::VERSION lt $filter {$_}} keys %args; WriteMakefile %args; __END__ Regexp-Common-2016020301/MANIFEST000644 000765 000024 00000004541 12654450574 016476 0ustar00abigailstaff000000 000000 Changes COPYRIGHT COPYRIGHT.AL COPYRIGHT.AL2 COPYRIGHT.BSD COPYRIGHT.MIT lib/Regexp/Common.pm lib/Regexp/Common/_support.pm lib/Regexp/Common/balanced.pm lib/Regexp/Common/CC.pm lib/Regexp/Common/comment.pm lib/Regexp/Common/delimited.pm lib/Regexp/Common/lingua.pm lib/Regexp/Common/list.pm lib/Regexp/Common/net.pm lib/Regexp/Common/number.pm lib/Regexp/Common/profanity.pm lib/Regexp/Common/SEN.pm lib/Regexp/Common/URI.pm lib/Regexp/Common/URI/RFC1035.pm lib/Regexp/Common/URI/RFC1738.pm lib/Regexp/Common/URI/RFC1808.pm lib/Regexp/Common/URI/RFC2384.pm lib/Regexp/Common/URI/RFC2396.pm lib/Regexp/Common/URI/RFC2806.pm lib/Regexp/Common/URI/fax.pm lib/Regexp/Common/URI/file.pm lib/Regexp/Common/URI/ftp.pm lib/Regexp/Common/URI/gopher.pm lib/Regexp/Common/URI/http.pm lib/Regexp/Common/URI/news.pm lib/Regexp/Common/URI/pop.pm lib/Regexp/Common/URI/prospero.pm lib/Regexp/Common/URI/tel.pm lib/Regexp/Common/URI/telnet.pm lib/Regexp/Common/URI/tv.pm lib/Regexp/Common/URI/wais.pm lib/Regexp/Common/whitespace.pm lib/Regexp/Common/zip.pm MANIFEST Makefile.PL README TODO t/Common.pm t/comment/delimited.t t/comment/html.t t/comment/nested.t t/comment/pascal.t t/comment/single_line.t t/comment/single_or_multiline.t t/number/101_integer.t t/number/111_integer_base.t t/number/121_integer_places.t t/number/122_integer_places.t t/number/123_integer_places.t t/number/131_integer_sep.t t/number/141_integer_group.t t/number/decimal.t t/number/number.t t/SEN/usa_ssn.t t/test___luhn.t t/test_balanced.t t/test_bases.t t/test_bases_sep.t t/test_comments.t t/test_curry.t t/test_delimited.t t/test_domain.t t/test_i.t t/test_ip.t t/test_ipv6.t t/test_keep.t t/test_lingua_palindrome.t t/test_list.t t/test_mac.t t/test_no_import.t t/test_profanity.t t/test_roman.t t/test_squares.t t/test_sub.t t/test_sub_named.t t/test_ws.t t/URI/fax.t t/URI/file.t t/URI/ftp.t t/URI/gopher.t t/URI/http.t t/URI/news.t t/URI/nntp.t t/URI/pop.t t/URI/prospero.t t/URI/tel.t t/URI/telnet.t t/URI/tv.t t/URI/wais.t t/URI/any.t t/zip/australia.t t/zip/belgium.t t/zip/denmark.t t/zip/france.t t/zip/germany.t t/zip/greenland.t t/zip/netherlands.t t/zip/norway.t t/zip/italy.t t/zip/spain.t t/zip/us.t t/zip/zip.t t/zzz_50_pod.t t/zzz_60_pod_coverage.t META.yml Module YAML meta-data (added by MakeMaker) META.json Module JSON meta-data (added by MakeMaker) Regexp-Common-2016020301/META.json000644 000765 000024 00000002341 12654450574 016762 0ustar00abigailstaff000000 000000 { "abstract" : "Provide commonly requested regular expressions", "author" : [ "Abigail " ], "dynamic_config" : 1, "generated_by" : "ExtUtils::MakeMaker version 7.0401, CPAN::Meta::Converter version 2.150001", "keywords" : [ "regular expression", "pattern" ], "license" : [ "mit" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : "2" }, "name" : "Regexp-Common", "no_index" : { "directory" : [ "t", "inc" ] }, "prereqs" : { "build" : { "requires" : { "ExtUtils::MakeMaker" : "0", "strict" : "0" } }, "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0", "strict" : "0" } }, "runtime" : { "requires" : { "perl" : "5.00473", "strict" : "0", "vars" : "0" } } }, "release_status" : "stable", "resources" : { "repository" : { "url" : "git://github.com/Abigail/Regexp--Common.git" } }, "version" : "2016020301", "x_test_requires" : { "strict" : 0 } } Regexp-Common-2016020301/META.yml000644 000765 000024 00000001325 12654450573 016612 0ustar00abigailstaff000000 000000 --- abstract: 'Provide commonly requested regular expressions' author: - 'Abigail ' build_requires: ExtUtils::MakeMaker: '0' strict: '0' configure_requires: ExtUtils::MakeMaker: '0' strict: '0' dynamic_config: 1 generated_by: 'ExtUtils::MakeMaker version 7.0401, CPAN::Meta::Converter version 2.150001' keywords: - 'regular expression' - pattern license: mit meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: '1.4' name: Regexp-Common no_index: directory: - t - inc requires: perl: '5.00473' strict: '0' vars: '0' resources: repository: git://github.com/Abigail/Regexp--Common.git version: '2016020301' x_test_requires: strict: 0 Regexp-Common-2016020301/README000644 000765 000024 00000007552 12654447171 016231 0ustar00abigailstaff000000 000000 ============================================================================== Release of version 2016020301 of Regexp::Common ============================================================================== The main reason for version 2.122 is a change in the license. You now have the option to use this software under either the original Artistic License, the Artistic License 2.0, the MIT license, or the BSD license. WARNINGS: As of version 2013030901, $RE {balanced} is no longer supported for pre-5.10 Perls. INCOMPATIBLE CHANGE in version 2.119: The $N settings for the -keep option of US postal codes ($RE {zip} {US} {-keep}) have been changed. See the Regexp::Common::zip for details. INCOMPATIBLE CHANGE in version 2.113: Regexp::Common used to set $; to '='. This no longer happens, because setting $; breaks Filter::Simple. This means that regexps of the form $RE{foo}{"-flag=value"} no longer work! They need to be written as $RE{foo}{"-flag$;value"} or as $RE{foo}{-flag => "value"}. When defining patterns using the pattern function, a = still needs to be used to separate the flag from its default value. This has not been changed. We are very sorry for the inconvenience. NAME Regexp::Common - Provide commonly requested regular expressions SYNOPSIS use Regexp::Common; while (<>) { /$RE{num}{real}/ and print q{a number\n}; /$RE{quoted}/ and print q{a ['"`] quoted string\n}; /$RE{delimited}{-delim=>'/'}/ and print q{a /.../ sequence\n}; /$RE{balanced}{-parens=>'()'}/ and print q{balanced parentheses\n}; /$RE{profanity}/ and print q{a #*@%-ing word\n}; } DESCRIPTION By default, this module exports a single hash (`%RE') that stores or generates commonly needed regular expressions. Patterns currently provided include: * balanced parentheses and brackets * delimited text (with escapes) * integers and floating-point numbers in any base (up to 36) * comments in 44 languages * offensive language * lists of any pattern * IPv4 addresses * URIs. * Zip codes. Future releases of the module will also provide patterns for the following: * email addresses * HTML/XML tags * mail headers (including multiline ones), * more URIs * telephone numbers of various countries * currency (universal 3 letter format, Latin-1, currency names) * dates * binary formats (e.g. UUencoded, MIMEd) * Credit card numbers. INSTALLATION It's all pure Perl, so just put the .pm files in their appropriate local Perl subdirectory. Alternatively, use the common approach: - untar the archive - run: perl Makefile.PL - run: make - run: make test - run: make install AUTHORS Damian Conway (damian@cs.monash.edu.au) and Abigail (regexp-common@abigail.be) MAINTAINER Abigail (regexp-common@abigail.be) COPYRIGHT and LICENSE This software is Copyright (c) 2001 - 2013, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licences: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. ============================================================================== AVAILABILITY Regexp::Common has been uploaded to the CPAN and is also available from: http://github.com/Abigail/Regexp--Common.git ============================================================================== Regexp-Common-2016020301/t/000755 000765 000024 00000000000 12654450573 015603 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/TODO000644 000765 000024 00000000664 12116413566 016030 0ustar00abigailstaff000000 000000 - URIs: + As defined in RFC 1738. + More of them. - Dates: + localtime dates. + ISO Dates. + An inverse of strftime? - numbers: + Decimal numbers (e.g. 7.5, 0.3, .99, 15, but not 1.23E5). + Roman numbers >= 4000. Unicode? + Prime numbers? Fibonacci? Other special numbers? + Ranges of numbers. - postal codes. + Lots more, especially British and Canadians. - Email addresses. + RFC 822/2822. Regexp-Common-2016020301/t/comment/000755 000765 000024 00000000000 12654450573 017245 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/t/Common.pm000644 000765 000024 00000067612 12116413566 017377 0ustar00abigailstaff000000 000000 package t::Common; use strict; use lib qw {blib/lib}; use vars qw /@ISA @EXPORT @EXPORT_OK $DEBUG/; use Regexp::Common; use Exporter (); use warnings; @ISA = qw /Exporter/; @EXPORT = qw /run_tests run_new_tests NORMAL_PASS NORMAL_FAIL FAIL $DEBUG/; @EXPORT_OK = qw /cross criss_cross pass fail d pd dd pdd l ll L LL a aa w ww _x xx X XX h hh gimme sample/; my @STATES = qw /pass fail/; our $SKIP; use constant NORMAL_PASS => 0x01; # Normal test, should pass. use constant NORMAL_FAIL => 0x02; # Normal test, should fail. use constant NORMAL => NORMAL_PASS | NORMAL_FAIL; use constant FAIL => 0x04; # Test for failure. sub run_test; sub run_old_keep; sub run_fail; sub count_me; sub is_skipped; my $count; sub stringify; sub stringify { my $arg = shift; if (!defined $arg) {return ""} elsif (!ref $arg) {$arg =~ s/\\/\\\\/g; $arg =~ s/\n/\\n/g; $arg =~ s/\t/\\t/g; return "$arg"} elsif ( ref $arg eq "ARRAY") { local $" = ", "; return "[@{[map {q{'} . stringify ($_) . q{'}} @$arg]}]"; } else {return ref $arg} } sub mess { my $str = stringify $_; my $com = join " " => map {stringify $_} @_; $count ++; if ($SKIP) {printf qq !%4d # SKIP: %s\n! => $count, $SKIP;} else {printf qq !%4d - %-40s (%s)\n! => $count, qq !"$str"!, $com;} } sub pass {print "ok "; &mess} sub fail {print +$SKIP ? "ok " : "not ok "; &mess} sub Fail { my $mess = shift; my %args = @_; if ($args {got} && $args {expected}) { printf "# Expected: '%s'\n", stringify $args {expected}; printf "# Got: '%s'\n", stringify $args {got}; } fail $mess; } sub import { if (@_ > 1 && $_ [-1] =~ /^\d+\.\d+$/) { my $version = pop; if ($version > $]) { print "1..1\n"; print "ok 1\n"; exit; } } __PACKAGE__ -> export_to_level (1, @_); } # # Return a cross product from its arguments. Arguments are array refs. # Result is a list of array refs. # sub cross { my @r = []; @r = map {my $s = $_; map {[@$_ => $s]} @r} @$_ for @_; @r } sub criss_cross { my ($f, $s) = @_; my @r; push @r => cross @$f [0 .. $_ - 1], $$s [$_], @$f [$_ + 1 .. $#$f] for 0 .. $#$f; @r; } sub __ {map {defined () ? $_ : "UNDEF"} @_} sub count_test_runs { my ($tests, $passes, $failures) = @_; my $keep = 0; my $normal = 0; my $fail = 0; foreach my $test (@$tests) { while (my ($name, $mask) = each %{$test -> [2]}) { $normal += @{$passes -> {$name}} if $mask & NORMAL; $keep += @{$passes -> {$name}} if $mask & NORMAL_PASS; $fail += @{$failures -> {$name}} if $mask & FAIL; } } 1 + $normal + $keep + $fail; } # Arguments: # tests: hash ref with the re's, names, and when to (not)match. # good: ref to array with arrays, parts making patterns. # bad: ref to array with arrays, parts not making patterns. # query: code ref, creates query strings. # wanted: code ref, creates list what keep should return. # # Filter arguments are used to filter chunks before trying them. # All of them are code refs. # filter: filter everything. # filter_passes: filter passes. # filter_failures: filter failures. # filter_test: filter called with testname. sub run_tests { my %args = @_; my $tests = $args {tests}; # Collect the names of all tags. my %tag_names; @tag_names {keys %{$_ -> [2]}} = () foreach @$tests; my (@passes, @failures); if ($args {good}) { @passes = cross @{$args {good}}; @failures = (); foreach my $i (0 .. $#{$args {good}}) { push @failures => cross @{$args {good}} [0 .. $i - 1], $args {bad} [$i], @{$args {good}} [$i + 1 .. $#{$args {good}}] } } elsif ($args {good_list}) { @passes = @{$args {good_list}}; } # General filters. @passes = grep {$args {filter_passes} -> ($_)} @passes if defined $args {filter_passes}; @passes = grep {$args {filter} -> ($_)} @passes if defined $args {filter}; @failures = grep {$args {filter_failures} -> ($_)} @failures if defined $args {filter_failures}; @failures = grep {$args {filter} -> ($_)} @failures if defined $args {filter}; my (%passes, %failures); # Specific filters. if (defined $args {filter_test}) { foreach my $name (keys %tag_names) { $passes {$name} = [grep {$args {filter_test} -> (pass => $name, $_)} @passes]; $failures {$name} = [grep {$args {filter_test} -> (failure => $name, $_)} @failures]; } } else { foreach my $name (keys %tag_names) { $passes {$name} = [@passes]; $failures {$name} = [@failures]; } } my $runs = count_test_runs $tests, \%passes, \%failures; print "1..$runs\n"; print "ok ", ++ $count, "\n"; my @test_names = map {$_ -> [1]} @$tests; my @tag_names = keys %tag_names; my $wanted = $args {wanted}; foreach my $test (@$tests) { my ($name, $re, $matches) = @$test; while (my ($tag, $match) = each %$matches) { if ($match & NORMAL) { foreach my $pass (@{$passes {$tag}}) { local $_ = $args {query} -> ($tag => $pass); run_test re => $re, name => $name, match => $match & NORMAL_PASS; run_old_keep re => $re, name => $name, tag => $tag, parts => $pass, wanted => $wanted if $match & NORMAL_PASS; } } if ($match & FAIL) { foreach my $failure (@{$failures {$tag}}) { local $_ = $args {query} -> ($tag => $failure); run_fail re => $re, name => $name; } } } } } sub run_test { my %args = @_; my $re = $args {re}; my $name = $args {name}; my $should_match = $args {match}; my $match = /^$re/; # Not anchored at the end on purpose. my $good = $match && $_ eq $&; my $line = $good ? "match" : $match ? "wrong match (got: $&)" : "no match"; $line .= "; $name"; if ($should_match) {$good ? pass $line : fail $line} else {$match ? fail $line : pass $line} } sub array_cmp { my ($a1, $a2) = @_; return 0 unless @$a1 eq @$a2; foreach my $i (0 .. $#$a1) { # !defined $$a1 [$i] && !defined $$a2 [$i] || # defined $$a1 [$i] && defined $$a2 [$i] && $$a1 [$i] eq $$a2 [$i] (!defined $$a1 [$i] || $$a1 [$i] eq "") && (!defined $$a2 [$i] || $$a2 [$i] eq "") || defined $$a1 [$i] && defined $$a2 [$i] && $$a1 [$i] eq $$a2 [$i] or return 0; } return 1; } sub run_old_keep { my %args = @_; my $re = $args {re}; # Regexp that's being tried. my $name = $args {name}; # Name of the test. my $tag = $args {tag}; # Tag to pass to wanted sub. my $parts = $args {parts}; # Parts to construct string from. my $wanted_sub = $args {wanted}; # Sub to contruct wanted array from. my @chunks = /^$re->{-keep}$/; unless (@chunks) {fail "no match; $name - keep"; return} my $wanted = $wanted_sub -> ($tag => $parts); local $" = ", "; array_cmp (\@chunks, $wanted) ? pass "match; $name - keep" : $DEBUG ? fail "wrong match,\n# got [@{[__ @chunks]}]\n" . "# expected [@{[__ @$wanted]}]" : fail "wrong match [@{[__ @chunks]}]" } ################## # # # New style subs # # # ################## # # Messages printed at end are of the form: # [XX/Y/ZZ], with XX denoting the type of match, Y the expected result, # and ZZ the result. # # XX: - RE: Regular expression # - SB: Subroutine call # - OM: OO -> match # - OS: OO -> subs # - KP: Regular expression with -keep # # Y: - P: Expected to pass # - F: Expected to fail # # ZZ: - MT: Pattern matched correctly # - NM: Pattern did not match # - WM: Pattern matched, but incorrectly. # # Given a regex and a string, test whether the regex fails to match. # Matching anything other than the entire string is a pass (as it regex # fails to match the entire string) # sub run_fail { my %args = @_; my $re = $args {re}; my $name = $args {name}; /^$re/ && $_ eq $& ? fail "[RE/F/MT] $name" : pass "[RE/F/NM] $name"; } # # Same as 'run_fail', except now not a regex, but a subroutine is given. # sub run_sub_fail { my %args = @_; my $sub = $args {sub}; my $name = $args {name}; my @args = $args {sub_args} ? ref $args {sub_args} ? @{$args {sub_args}} : $args {sub_args} : (); $_ =~ $sub -> (@args) && $_ eq $& ? fail "[SB/F/MT] $name" : pass "[SB/F/NM] $name"; } # # We can test whether it matched, but we can't really test whether # it matched the entire string. $& relates to the last successful # match in the current scope, but the match done in $re -> matches() # is done in a subscope. @-/@+ are equally useless. # sub run_OO_pass { my %args = @_; my $re = $args {re}; my $name = $args {name}; my $match = $re -> matches ($_); if ($match) {pass "[OM/P/MT] $name"} else {fail "[OM/P/NM] $name"} } # # Test whether the subroutine gives the right answer. # sub run_sub_pass { my %args = @_; my $sub = $args {sub}; my $name = $args {name}; my @args = $args {sub_args} ? ref $args {sub_args} ? @{$args {sub_args}} : $args {sub_args} : (); my $match = $_ =~ $sub -> (@args); my $good = $match && $_ eq $&; if ($good) {pass "[SB/P/MT] $name"} elsif ($match) {Fail "[SB/P/WM] $name", got => $&, expected => $_} else {fail "[SB/P/NM] $name"} } # # Check whether the substitution (only for OO) works correctly. # sub run_OO_substitution_pass { my %args = @_; my $re = $args {re}; my $name = $args {name}; my $token = $args {token} || "---"; my $sub = $re -> subs ($_, $token); my $good = $sub eq $token; if ($good) {pass "[OS/P/MT] $name"} elsif ($sub ne $_) {Fail "[OS/P/NM] $name", got => $sub, expected => $token} else {fail "[OS/P/WM] $name"} } sub run_pass { my %args = @_; my $re = $args {re}; my $name = $args {name}; my $match = /^$re/; # Not anchored at the end on purpose. my $good = $match && $_ eq $&; my $perfect = $good && !defined $1; # Should *not* set $1 and friends. if ($perfect) {pass "[RE/P/MT] $name"} elsif ($good) {fail "[RE/P/MT], sets \$1; $name"} elsif ($match) {Fail "[RE/P/WM] $name", got => $&, expected => $_} else {fail "[RE/P/NM] $name"} } sub run_keep { my %args = @_; my $re = $args {re}; # Regexp that's being tried. my $name = $args {name}; # Name of the test. my $wanted = $args {wanted}; # Wanted list. my @chunks = /^$re->{-keep}/; unless (@chunks) {fail "[KP/P/NM] $name"; return} array_cmp (\@chunks, $wanted) ? pass "[KP/P/MT] $name" : Fail "[KP/P/WM] $name", got => \@chunks, expected => $wanted; } sub get_args { my $key = shift; foreach my $ref (@_) { next unless exists $$ref {$key}; return ref $$ref {$key} eq 'ARRAY' ? @{$$ref {$key}} : $$ref {$key} } return; } sub run_new_test_set { my %args = @_; my $test_set = $args {test_set}; my $targets = $args {targets}; my $name = $$test_set {name}; my $regex = $$test_set {regex} || $$test_set {re}; # Getting tired of # getting this wrong. my $sub = $$test_set {sub}; my $sub_args = $$test_set {sub_args}; my $keep = $regex -> {-keep}; my $pass = $$test_set {pass}; my $fail = $$test_set {fail}; my $skip_sub = $$test_set {skip_sub}; # # Run the passes. # foreach my $target_info (@$pass) { my $target_name = $$target_info {name}; my $query = $$targets {$target_name} {query}; next unless $$targets {$target_name} {list} && @{$$targets {$target_name} {list}}; my $un_seen = @{$$targets {$target_name} {list}}; my $samples = count_me $$targets {$target_name} {list}, $$target_info {limit}, $$test_set {limit}; foreach my $parts (@{$$targets {$target_name} {list}}) { next unless $samples > rand $un_seen --; $samples --; # # Calculate the sections we're going to skip. # my %skips; foreach my $skip (qw /RE SB OO OM OS KP/) { $skips {$skip} = is_skipped $skip => $target_info, $test_set; } $skips {OM} ||= $skips {OO}; $skips {OS} ||= $skips {OO}; # # Find the thing we need to match against. # Note that we're going to match against $_. # my @args = ref $parts ? @$parts : $parts; my @qargs = get_args query_args => $target_info, $test_set; local $_ = $query ? $query -> (@qargs, @args) : ref $parts ? join "" => @$parts : $parts; # # See whether we want to skip the test # local $SKIP = $skip_sub && $skip_sub -> (pass => $_); # # Find out the things {-keep} should return. # The thing we match agains is in $_. # my @wanted; unless ($skips {KP}) { my @wargs = get_args wanted_args => $target_info, $test_set; my $w_sub = $$target_info {wanted} || $$targets {$target_name} {wanted}; @wanted = $w_sub ? $w_sub -> (@wargs, @args) : $_; if (@wanted == 1 && ref $wanted [0] eq "ARRAY") { @wanted = @{$wanted [0]}; } } run_pass name => $name, re => $regex unless $skips {RE}; run_OO_pass name => $name, re => $regex unless $skips {OM}; run_OO_substitution_pass name => $name, re => $regex unless $skips {OS}; run_sub_pass name => $name, sub_args => $sub_args, sub => $sub if $sub && !$skips {SB}; run_keep name => $name, re => $keep, wanted => \@wanted unless $skips {KP}; } } # # Run the failures. # foreach my $target_info (@$fail) { my $target_name = $$target_info {name}; my $query = $$targets {$target_name} {query}; next unless $$targets {$target_name} {list} && @{$$targets {$target_name} {list}}; my $un_seen = @{$$targets {$target_name} {list}}; my $samples = count_me $$targets {$target_name} {list}, $$target_info {limit}, $$test_set {limit}; foreach my $parts (@{$$targets {$target_name} {list}}) { next unless $samples > rand $un_seen --; $samples --; my @args = ref $parts ? @$parts : $parts; my @qargs = get_args query_args => $target_info, $test_set; local $_ = $query ? $query -> (@qargs, @args) : ref $parts ? join "" => @$parts : $parts; local $SKIP = $skip_sub && $skip_sub -> (fail => $_); my %skips; foreach my $skip (qw /RE SB/) { $skips {$skip} = is_skipped $skip => $target_info, $test_set; } run_fail name => $name, re => $regex unless $skips {RE}; run_sub_fail name => $name, sub_args => $sub_args, sub => $sub if $sub && !$skips {SB}; } } } # # If there's no list, or an empty list, 0 tests have to be run. # If no limits are given, return the size of the list. # Else, for the first defined limit, # if the limit is negative, return the size of the list, # else if the limit is 0, return 0, # else if the limit is less than 1, treat it as a fraction, # else, return the smaller of the limit and the size of the list. # sub count_me { my ($list, @limits) = @_; return 0 unless $list && @$list; foreach my $limit (@limits) { if (defined $limit) { return @$list if $limit < 0; return int (@$list * $limit) if $limit < 1; return $limit if $limit < @$list; return @$list; } } @$list; } # # Normify any 'pass','fail' and 'skip' entries in a test. # What we want is a 'pass' and a 'fail' pointing to an array of hashes, # each hash being a 'target'. # # Since we are passed a reference, the modification is done in situ. # sub normify { my $test = shift; foreach my $state (@STATES) { my @list; foreach my $postfix ("", "_arg") { my $key = "$state$postfix"; next unless exists $$test {$key}; my $targets = $$test {$key}; if (ref $targets eq 'ARRAY') { foreach my $thingy (@$targets) { if (ref $thingy eq 'HASH') { push @list => $thingy; } elsif (!ref $thingy) { push @list => {name => $thingy} } } } elsif (ref $targets eq 'HASH') { push @list => $targets; } else { push @list => {name => $targets}; } delete $$test {$key}; } $$test {$state} = \@list; } # # Skips. # if (!exists $$test {skip}) {$$test {skip} = {}} elsif (ref $$test {skip} eq 'ARRAY') { $$test {skip} = {map {$_ => 1} @{$$test {skip}}} } foreach my $state (@STATES) { foreach my $target (@{$$test {state}}) { if (!exists $$target {skip}) {$$target {skip} = {}} elsif (ref $$target {skip}) { $$target {skip} = {map {$_ => 1} @{$$target {skip}}} } } } } sub is_skipped { my ($type, @things) = @_; foreach my $thingy (@things) { return $$thingy {skip} {$type} if defined $$thingy {skip} {$type}; } return; } sub mult { my ($state, $has_sub, @things) = @_; my $mult; # Regular expression test. $mult ++ unless is_skipped RE => @things; # Subroutine check. $mult ++ if $has_sub && !is_skipped SB => @things; if ($state eq "pass") { # OO checks. $mult ++ unless is_skipped OO => @things or is_skipped OM => @things; $mult ++ unless is_skipped OO => @things or is_skipped OS => @things; # Keep check. $mult ++ unless is_skipped RE => @things or is_skipped KP => @things; } return $mult; } sub run_new_tests { my %args = @_; my ($tests, $targets, $version, $version_from, $extra_runs, $extra_runs_sub) = @args {qw /tests targets version version_from extra_runs extra_runs_sub/}; # # Modify any 'pass' and 'fail' entries to arrays of hashes. # foreach my $test (@$tests) { normify $test; } # # Count the number of runs. # my $runs = defined $version_from; # VERSION test. my $no_tests; if ($extra_runs) { $runs += $extra_runs; $count += $extra_runs; } if (defined $version && $version > $]) { $no_tests = 1; } else { # Count the tests to be run. foreach my $test (@$tests) { # Test: pass: regex, regex/keep, OO, OO-substitution, sub (if given) # fail: regex, sub (if given). my $has_sub = $$test {sub} ? 1 : 0; for my $state (@STATES) { foreach my $target (@{$$test {$state}}) { my $size = count_me $$targets {$$target {name}} {list}, $$target {limit}, $$test {limit}; $runs += $size * mult $state, $has_sub => $target, $test; } } } } print "1..$runs\n"; # Check whether a version is defined. if (defined $version_from) { print "ok ", ++ $count, "\n"; } if ($extra_runs_sub) { $extra_runs_sub -> (\$count) } unless ($no_tests) { foreach my $test (@$tests) { run_new_test_set test_set => $test, targets => $targets; } } } # # Function to produce random strings. # # Digit. sub d {int rand 10} # Positive digit. sub pd {1 + int rand 9} # String of digits. sub dd {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {d} 1 .. $min + int rand ($max - $min)} # String of digits, not all 0. sub pdd {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); TRY: my $dd = join "" => map {d} 1 .. $min + int rand ($max - $min); goto TRY unless $dd =~ /[^0]/; $dd} # Lowercase letter. sub l {chr (ord ('a') + int rand 26)} # String of lowercase letters. sub ll {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {l} 1 .. $min + int rand ($max - $min)} # Uppercase letter. sub L {chr (ord ('a') + int rand 26)} # String of uppercase letters. sub LL {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {L} 1 .. $min + int rand ($max - $min)} # Alpha. sub a {50 < rand (100) ? l : L} # String of alphas. sub aa {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {a} 1 .. $min + int rand ($max - $min)} # Alphanum. sub w {52 < rand (62) ? d : a} # String of alphanums. sub ww {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {w} 1 .. $min + int rand ($max - $min)} # Lowercase hex digit. sub _x {(0 .. 9, 'a' .. 'f') [int rand 16]} # String of lowercase hex digits. sub xx {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {_x} 1 .. $min + int rand ($max - $min)} # Uppercase hex digit. sub X {(0 .. 9, 'A' .. 'F') [int rand 16]} # String of uppercase hex digits. sub XX {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {X} 1 .. $min + int rand ($max - $min)} # Any case hex digit sub h {(0 .. 9, 'A' .. 'F', 'a' .. 'f') [int rand 22]} # String of anycase hex digits sub hh {my ($min, $max) = @_ > 1 ? (@_) : ($_ [0], $_ [0]); join "" => map {h} 1 .. $min + int rand ($max - $min)} # # Pass a number N and a callback C. Return N different results from C. # Will do at most 100 * N tries. # sub gimme { my ($count, $call) = @_; my %cache; foreach (1 .. 100 * $count) { $cache {$call -> ()} = 1; last if keys %cache >= $count; } keys %cache; } # # Given a number N, and a list of things, return a sample of N # sub sample { my $N = shift; return @_ if @_ <= $N; my @cache = splice @_ => 0, $N; my $count = $N; map {rand ++ $count < $N and splice @cache, rand @cache, 1, $_} @_; @cache; } 1; __END__ =head1 DESCRIPTION C is called with three (named) parameters: =over 4 =item C A references to an array of I (explained below). =item C A reference to a hash of I (explained below). =item C The name of the file that is checked for a version number. =back =head2 Targets Targets provide a set of strings to match against. Targets are indexed by name. Each target is a hash, with the following keys: =over 4 =item C Required. This is a reference to an array that will act as building blocks to build strings to match against. In the simplest form, this is just an array with strings - but typically, this is an array of arrays, each subarray used to create a string. =item C A coderef. For each entry in array given above, this coderef is called. It takes a set of arguments and returns a string to match against. If the corresponding entry in C is reference to an array, all its elements are passed - otherwise, the entry is passed as a whole. Extra arguments provided with C below are prepended. If no coderef is given, C is assumed. =item C A coderef. If the target is used for positive matches (that is, it's expected to match), this sub is called with the same arguments as C - except that C are prepended. It should return a list of strings as if the regular expression was called with C<{-keep}>. The string to match against may be assumed to be C<$_>. If no coderef is given, C is assumed. =back =head2 Tests The tests to run are put in an array, and run in that order. Each test tests a specific pattern. Up to seven types of tests are performed, depending whether the tests includes expected failures, expected passes or both. Expected passes are tested as a regular expression, as a regular expression with the C<{-keep}> option, as a subroutine, as an object using the C method, and as an object using the C method. Expected failures are tested as a regular expression, and as a subroutine. Each test is a hash with the following keys: =over 4 =item C The name of this test - mostly used in the test output. =item C The pattern to test with. =item C The subroutine to test with, if any. =item C Any arguments that need to be passed into the subroutine. If more than one argument needs to be passed, use a reference to an array - the array will be flattened when calling the subroutine. =item C Extra arguments to pass into the C coderef for all the targets belonging to this tests, if not overriden as discussed below. =item C Extra arguments to pass into the C coderef for all the targets belonging to this tests, if not overriden as discussed below. =item C Indicates which targets (discussed above) should be run with expected passes. The value of C is either a reference to an array - the array containing the names of the targets to run, or a reference to a hash. In the latter case, the keys are the targets to be run, while the keys are hash references, containing more configuration options for the target. Values allowed: =over 4 =item C Extra arguments to pass into the C coderef belonging to this test. See discussion above. =item C Extra arguments to pass into the C coderef belonging to this test. See discussion above. =back =item C As C, except that it will list targets with an expected failure. =back Regexp-Common-2016020301/t/number/000755 000765 000024 00000000000 12654450573 017073 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/t/SEN/000755 000765 000024 00000000000 12654450573 016230 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/t/test___luhn.t000755 000765 000024 00000002145 12116413566 020272 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use Regexp::Common::_support qw /luhn/; use warnings; my $TESTS = 100; my @good = qw /49927398716 00000000000/; my @bad = qw /49927398717 49927398715/; # Generate a bunch of valid, and invalid, numbers. my %cache; foreach (1 .. $TESTS) { my $length = 1 + int rand (1 > rand 10 ? 100 : 20); my $s = join "" => map {int rand 10} 1 .. $length; redo if $cache {$s} ++; my $even = 1; my $sum = 0; foreach my $n (split // => $s) { $n *= 2 if $even; $sum += ($n % 10) + int ($n / 10); $even = !$even; } my $c = $sum % 10 ? 10 - ($sum % 10) : 0; my $d = $c; $d = int rand 10 while $d == $c; my $g = reverse ($s) . $c; my $b = reverse ($s) . $d; push @good => $g; push @bad => $b; } my $total = @good + @bad; print "1..$total\n"; my $c = 0; foreach my $g (@good) { print "not " unless luhn $g; print "ok ", ++ $c, " # luhn ($g)\n"; } foreach my $b (@bad) { print "not " if luhn $b; print "ok ", ++ $c, " # !luhn ($b)\n"; } __END__ Regexp-Common-2016020301/t/test_balanced.t000755 000765 000024 00000006474 12116636016 020565 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; exit unless $] >= 5.010; # SIMPLE BALANCING ACT try $RE{balanced}; pass "()"; pass "(a)"; pass "(a b)"; pass "(a()b)"; pass "(a( )b)"; pass "(a(b))"; pass "(a(b)(c)(d(e)))"; pass "(a(])b)"; pass "(a({{{)b)"; fail "("; fail "(a"; fail "(a(b)"; fail "(a( b)"; fail "(a(]b)"; fail "(a({{{)b"; # MULTIPLE BALANCING ACT try $RE{balanced}{-parens=>"()[]"}; pass "()"; pass "(a)"; pass "(a b)"; pass "(a()b)"; pass "(a( )b)"; pass "(a(b))"; pass "(a(b)(c)(d(e)))"; pass "(a(})b)"; pass "(a([[()]])b)"; fail "("; fail "(a"; fail "(a(b)"; fail "(a( b)"; fail "(a(]b)"; fail "(a([[[)b"; try $RE{balanced}{-begin => 'begin'}{-end => 'end'}; pass 'begin end'; fail 'begin en'; fail 'begin nd'; pass 'begin begin end end'; pass 'beginend'; pass 'beginbeginbeginendendend'; pass 'begin begin end begin begin end begin end end end'; fail 'begin begin end begin egin end begin end end end'; fail 'begin end begin end'; try $RE{balanced}{-begin => 'start'}{-end => 'stop'}; pass 'start stop'; fail 'start st'; fail 'start op'; pass 'start start stop stop'; pass 'startstop'; pass 'startstartstartstopstopstop'; pass 'start start stop start start stop start stop stop stop'; fail 'start start stop start tart stop start stop stop stop'; fail 'start stop start stop'; try $RE{balanced}{-parens => '()[]'}{-begin => 'start'}{-end => 'stop'}; pass 'start stop'; fail 'start st'; fail 'start op'; pass 'start start stop stop'; pass 'startstop'; pass 'startstartstartstopstopstop'; pass 'start start stop start start stop start stop stop stop'; fail 'start start stop start tart stop start stop stop stop'; fail 'start stop start stop'; try $RE{balanced}{-begin => 'S'}{-end => 'T'}; pass 'S T'; fail 'S Q'; pass 'S S T T'; pass 'ST'; pass 'SSSTTT'; pass 'S S T S S T S T T T'; fail 'S S T S Q T S T T T'; fail 'S T S T'; try $RE{balanced}{-start => "(|["}{-end => ")|]"}; pass "()"; pass "(a)"; pass "(a b)"; pass "(a()b)"; pass "(a( )b)"; pass "(a(b))"; pass "(a(b)(c)(d(e)))"; pass "(a(})b)"; pass "(a([[()]])b)"; fail "("; fail "(a"; fail "(a(b)"; fail "(a( b)"; fail "(a(]b)"; fail "(a([[[)b"; # Test '|' delimiters. try $RE{balanced}{-begin => '\|'}{-end => '-'}; pass '| -'; fail '| Q'; pass '| | - -'; pass '|-'; pass '|||---'; pass '| | - | | - | - - -'; fail '| | - | Q - | - - -'; fail '| - | -'; try $RE{balanced}{-begin => '!'}{-end => '\|'}; pass '! |'; fail '! Q'; pass '! ! | |'; pass '!|'; pass '!!!|||'; pass '! ! | ! ! | ! | | |'; fail '! ! | ! Q | ! | | |'; fail '! | ! |'; try $RE{balanced}{-begin => '\||['} {-end => ')|]'}; pass "|)"; pass "|a)"; pass "|a b)"; pass "|a|)b)"; pass "|a| )b)"; pass "|a|b))"; pass "|a|b)|c)|d|e)))"; pass "|a|})b)"; pass "|a|[[|)]])b)"; fail "|"; fail "|a"; fail "|a|b)"; fail "|a| b)"; fail "|a|]b)"; fail "|a|[[[)b"; try $RE{balanced}{-begin => '(|['}{-end => ']'}; pass "(]"; pass "(a]"; pass "(a b]"; pass "(a(]b]"; pass "(a( ]b]"; pass "(a(b]]"; pass "(a(b](c](d(e]]]"; pass "(a(}]b]"; pass "(a([[(]]]]b]"; fail "("; fail "(a"; fail "(a(b]"; fail "(a( b]"; pass "(a(]b]"; fail "(a([[[]b"; Regexp-Common-2016020301/t/test_bases.t000755 000765 000024 00000012474 12116413566 020131 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # TEST BASE 10 try $RE{num}{real}{-keep}; pass 0; pass 1; pass 12; pass 1234567; pass 1.23456789; pass '+1'; pass '+12'; pass '+1234567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; pass '-1234567'; pass -1; pass -12; pass -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; fail "1,234,567"; fail "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; try $RE{num}{real}{-places => 2}; pass 0; pass 1; pass 12; pass 1234567; fail 1.23456789; pass 1.23; pass '+1'; pass '+12'; pass '+1234567.89'; pass '-1'; fail '-12.333333333333333333333333333333333333333'; pass '-1234567'; pass -1; pass -12; pass -1234567; fail 1.2 if $] > 5.00502; fail "a"; fail ""; fail "1a"; fail "- 1234"; fail "1,234,567"; fail "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; try $RE{num}{real}{-places => '3,8'}; pass 0; pass 1; pass 12; pass 1234567; pass 1.23456789; pass '+1'; pass '+12'; fail '+1234567.89'; pass '-1'; fail '-12.333333333333333333333333333333333333333'; pass '-1234567'; pass -1; pass -12; pass -1234567; fail 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; fail "1,234,567"; fail "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; try $RE{num}{dec}; pass 0; pass 1; pass 12; pass 1234567; pass 1.23456789; pass '+1'; pass '+12'; pass '+1234567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; pass '-1234567'; pass -1; pass -12; pass -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; fail "1,234,567"; fail "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; # TEST BASE 2 try $RE{num}{real}{-base => 2}; pass 0; pass 1; fail 12; fail 1234567; fail 1.23456789; pass '+1'; fail '+12'; pass '+101010'; pass '+101010.0001010'; fail '+1234567.89'; pass '-1'; pass -1; fail "a"; fail ""; fail "1a"; fail "- 1010"; fail "1,001,101"; fail "1,010.1110"; fail "1,0101,0011"; fail "1.0011.0011"; try $RE{num}{bin}; pass 0; pass 1; fail 12; fail 1234567; fail 1.23456789; pass '+1'; fail '+12'; pass '+101010'; pass '+101010.0001010'; fail '+1234567.89'; pass '-1'; pass -1; fail "a"; fail ""; fail "1a"; fail "- 1010"; fail "1,001,101"; fail "1,010.1110"; fail "1,0101,0011"; fail "1.0011.0011"; # TEST BASE 8 try $RE{num}{real}{-base => 8}; pass 0; pass 1; pass 12; pass 1234567; fail 12345678; pass 1.23456; pass '+1'; pass '+12'; pass '+1234567.01'; fail '+1234567.09'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; pass '-1234567'; fail '-1234568'; pass -1; pass -12; pass -1234567; fail -1234568; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; fail "1,234,567"; fail "12,345.67"; fail "12,345.68"; fail "1,2345,5670"; fail "1.234.567"; try $RE{num}{oct}; pass 0; pass 1; pass 12; pass 1234567; fail 12345678; pass 1.23456; pass '+1'; pass '+12'; pass '+1234567.01'; fail '+1234567.09'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; pass '-1234567'; fail '-1234568'; pass -1; pass -12; pass -1234567; fail -1234568; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; fail "1,234,567"; fail "12,345.67"; fail "12,345.68"; fail "1,2345,5670"; fail "1.234.567"; # TEST BASE 16 try $RE{num}{real}{-base => 16}; pass 0; pass 1; pass 12; pass '12A4C67'; fail '12345678G'; pass '1.23A56'; fail '1.23Z56'; pass '+1'; pass '+12'; pass '+1234567.01A'; fail '+1234567.09Q'; pass '-1'; pass '-12.ddddddddddddddddddddddddddddddddddddddd'; pass '-123B4567'; fail '-123H4567'; pass -1; pass -12; pass -1234567; pass 1.2; pass "a"; fail ""; pass "1a"; pass "a1a"; pass "DeadBeef"; fail "LiveLamb"; fail "- 1234"; fail "1,abc,def"; fail "12,345.67A"; fail "12,3C5,68"; fail "1,23C5,5670"; fail "1.234.567"; try $RE{num}{hex}; pass 0; pass 1; pass 12; pass '12A4C67'; fail '12345678G'; pass '1.23A56'; fail '1.23Z56'; pass '+1'; pass '+12'; pass '+1234567.01A'; fail '+1234567.09Q'; pass '-1'; pass '-12.ddddddddddddddddddddddddddddddddddddddd'; pass '-123B4567'; fail '-123H4567'; pass -1; pass -12; pass -1234567; pass 1.2; pass "a"; fail ""; pass "1a"; pass "a1a"; pass "DeadBeef"; fail "LiveLamb"; fail "- 1234"; fail "1,abc,def"; fail "12,345.67A"; fail "12,3C5,68"; fail "1,23C5,5670"; fail "1.234.567"; # TEST BASE 34 try $RE{num}{real}{-base => 34}; pass 0; pass 1; pass 12; pass '12A4C67'; pass '12345678G'; pass '1.23A56'; fail '1.23Z56'; pass '+1'; pass '+12'; pass '+1234567.01A'; pass '+1234567.09Q'; pass '-1'; pass '-12.ddddddddddddddddddddddddddddddddddddddd'; pass '-123B4567'; pass '-123H4567'; pass -1; pass -12; pass -1234567; pass 1.2; pass "a"; fail ""; pass "1a"; pass "a1a"; pass "DeadBeef"; pass "LiveLamb"; fail "- 1234"; fail "1,abc,def"; fail "12,345.67A"; fail "12,3C5,68" if $] > 5.00502; # Regex bug fail "1,23C5,5670"; fail "1.234.567"; # TEST BASE 1 try $RE{num}{real}{-base => 1}; pass 0; pass "00000000000"; pass "00000.00000"; fail "00,000,000,000"; fail "00,000.000000"; fail 1; fail 12; fail '12A4C67'; fail '12345678G'; try $RE{num}{real}{-base => 1}{"-sep$;,"}; pass 0; fail "00000000000"; fail "00000.00000"; pass "00,000,000,000"; pass "00,000.000000"; fail 1; fail 12; fail '12A4C67'; fail '12345678G'; Regexp-Common-2016020301/t/test_bases_sep.t000755 000765 000024 00000012644 12116413566 020777 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # TEST BASE 10 try $RE{num}{real}{-base => '10'}{-sep}; pass 0; pass 1; pass 12; fail 1234567; pass 1.23456789; pass 12.23456789; pass 123.23456789; fail 1234.23456789; pass '+1'; pass '+12'; fail '+1234567.89'; pass '+1,234,567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; pass -1; pass -12; fail -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1,234,567"; pass "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; # TRY WIERD ORDERING try $RE{-base => '10'}{num}{'-sep' => ' '}{real}; pass 0; pass 1; pass 12; fail 1234567; pass 1.23456789; pass 12.23456789; pass 123.23456789; fail 1234.23456789; pass '+1'; pass '+12'; fail '+1234567.89'; pass '+1 234 567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; pass -1; pass -12; fail -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1 234 567"; pass "12 345.6789"; fail "1 2345 6789"; fail "1.2345.6789"; # TRY FANCY FLAGS try $RE{-base=>10}{num}{-sep=>' '}{real}; pass 0; pass 1; pass 12; fail 1234567; pass 1.23456789; pass 12.23456789; pass 123.23456789; fail 1234.23456789; pass '+1'; pass '+12'; fail '+1234567.89'; pass '+1 234 567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; pass -1; pass -12; fail -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1 234 567"; pass "12 345.6789"; fail "1 2345 6789"; fail "1.2345.6789"; try $RE{num}{dec}{-sep}; pass 0; pass 1; pass 12; fail 1234567; pass 1.23456789; pass '+1'; pass '+12'; fail '+1234567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; pass -1; pass -12; fail -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1,234,567"; pass "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; # TEST BASE 2 try $RE{num}{real}{'-base' => '2'}{-sep}; pass 0; pass 1; fail 12; fail 1234567; fail 1.23456789; pass '+1'; fail '+12'; fail '+101010'; fail '+101010.0001010'; pass '+101,010.0001010'; fail '+1234567.89'; pass '-1'; pass -1; fail "a"; fail ""; fail "1a"; fail "- 1010"; pass "1,001,101"; pass "1,010.1110"; fail "1,0101,0011"; fail "1.0011.0011"; try $RE{num}{bin}{-sep}; pass 0; pass 1; fail 12; fail 1234567; fail 1.23456789; pass '+1'; fail '+12'; fail '+101010'; fail '+101010.0001010'; fail '+1234567.89'; pass '-1'; pass -1; fail "a"; fail ""; fail "1a"; fail "- 1010"; pass "1,001,101"; pass "1,010.1110"; fail "1,0101,0011"; fail "1.0011.0011"; # TEST BASE 8 try $RE{num}{real}{'-base' => '8'}{-sep}; pass 0; pass 1; pass 12; fail 1234567; fail 12345678; pass 1.23456; pass '+1'; pass '+12'; fail '+1234567.01'; fail '+1234567.09'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; fail '-1234568'; pass -1; pass -12; fail -1234567; fail -1234568; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1,234,567"; pass "12,345.67"; fail "12,345.68"; fail "1,2345,5670"; fail "1.234.567"; try $RE{num}{oct}{-sep}; pass 0; pass 1; pass 12; fail 1234567; fail 12345678; pass 1.23456; pass '+1'; pass '+12'; fail '+1234567.01'; fail '+1234567.09'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; fail '-1234568'; pass -1; pass -12; fail -1234567; fail -1234568; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1,234,567"; pass "12,345.67"; fail "12,345.68"; fail "1,2345,5670"; fail "1.234.567"; # TEST BASE 16 try $RE{num}{real}{'-base' => '16'}{-sep}; pass 0; pass 1; pass 12; fail '12A4C67'; fail '12345678G'; pass '1.23A56'; fail '1.23Z56'; pass '+1'; pass '+12'; fail '+1234567.01A'; fail '+1234567.09Q'; pass '-1'; pass '-12.ddddddddddddddddddddddddddddddddddddddd'; fail '-123B4567'; fail '-123H4567'; pass -1; pass -12; fail -1234567; pass 1.2; pass "a"; fail ""; pass "1a"; pass "a1a"; fail "DeadBeef"; pass "De,adB,eef"; fail "LiveLamb"; fail "- 1234"; pass "1,abc,def"; pass "12,345.67A"; fail "12,3C5,68" if $] > 5.00502; # Regex bug fail "1,23C5,5670"; fail "1.234.567"; try $RE{num}{hex}{-sep}; pass 0; pass 1; pass 12; fail '12A4C67'; fail '12345678G'; pass '1.23A56'; fail '1.23Z56'; pass '+1'; pass '+12'; fail '+1234567.01A'; fail '+1234567.09Q'; pass '-1'; pass '-12.ddddddddddddddddddddddddddddddddddddddd'; fail '-123B4567'; fail '-123H4567'; pass -1; pass -12; fail -1234567; pass 1.2; pass "a"; fail ""; pass "1a"; pass "a1a"; fail "DeadBeef"; fail "LiveLamb"; fail "- 1234"; pass "1,abc,def"; pass "12,345.67A"; fail "12,3C5,68" if $] > 5.00502; # Regex bug fail "1,23C5,5670"; fail "1.234.567"; # TEST BASE 34 try $RE{num}{real}{'-base' => '34'}{-sep}; pass 0; pass 1; pass 12; fail '12A4C67'; fail '12345678G'; pass '1.23A56'; fail '1.23Z56'; pass '+1'; pass '+12'; fail '+1234567.01A'; fail '+1234567.09Q'; pass '-1'; pass '-12.ddddddddddddddddddddddddddddddddddddddd'; fail '-123B4567'; fail '-123H4567'; pass -1; pass -12; fail -1234567; pass 1.2; pass "a"; fail ""; pass "1a"; pass "a1a"; fail "DeadBeef"; fail "LiveLamb"; pass "De,adB,eef"; pass "Li,veL,amb"; fail "- 1234"; pass "1,abc,def"; pass "12,345.67A"; fail "12,3C5,68" if $] > 5.00502; # Regex bug fail "1,23C5,5670"; fail "1.234.567"; Regexp-Common-2016020301/t/test_comments.t000755 000765 000024 00000017520 12116413566 020656 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} sub try2 {$P = qr /$_[0]$/} sub pass2 {ok ($S=($_[0] =~ $P && $& eq $_[1]))} sub fail2 {ok ($S=($_[0] !~ $P || $& ne $_[1]))} # LOAD use Regexp::Common; ok; my @ids = ( ['"' => [[Pascal => 'Workshop']]], ); my @from_to = ( [[[Pascal => 'Workshop']] => "/*", "*/"], [[qw /Pascal/, [Pascal => 'Workshop']] => "{", "}"], [[qw /Pascal/, [Pascal => 'Workshop']] => "(*", "*)"], [[qw /Pascal/] => "{", "*)"], [[qw /Pascal/] => "(*", "}"], ); foreach my $info (@ids) { my ($mark, $languages) = @$info; my $not_mark = $mark eq '#' ? '!' : '#'; foreach my $language (@$languages) { if (ref $language) { try $RE{comment}{$language -> [0]}{$language -> [1]}; $language = join ":" => @$language; } else { try $RE{comment}{$language}; } $M .= "# $language\n"; pass qq !${mark}${mark}!; pass qq !${mark}a comment${mark}!; pass qq !${mark}/*a comment */${mark}!; pass qq !${mark}/************${mark}!; pass qq !${mark}/////////////${mark}!; fail qq !${mark}a${mark}${mark}multiline${mark}${mark}comment${mark}!; fail qq !${mark}a comment!; fail qq !${mark}/*a comment */!; fail qq !${mark}/************!; fail qq !${mark}/////////////!; fail qq !${not_mark}${mark}!; fail qq !${not_mark}a comment${mark}!; fail qq !${not_mark}/*a comment */${mark}!; fail qq !${not_mark}/************${mark}!; fail qq !${not_mark}${mark}////////////${mark}!; fail qq !//a comment${mark}!; fail qq !///*a comment */${mark}!; fail qq !///************${mark}!; fail qq !///////////////${mark}!; fail qq !//a${mark}//multiline${mark}//comment${mark}!; fail qq !//a comment!; fail qq !///*a comment */!; fail qq !///************!; fail qq !///////////////!; next if $language eq 'Pascal:Workshop'; fail qq !/*a comment */!; fail qq !/************/!; fail qq !/*a${mark}multiline${mark}comment*/!; fail qq !/*a /*pretend*/ nested comment*/!; fail qq !/*a /*pretend*/!; } } foreach my $info (@from_to) { my ($languages, $from, $to) = @$info; my $f = substr $from => 0, 1; my $t = substr $to => 0, 1; foreach my $language (@$languages) { if (ref $language) { try $RE{comment}{$language -> [0]}{$language -> [1]}; $language = join ":" => @$language; } else { try $RE{comment}{$language}; } my $mark = $language eq 'Nickle' ? ';' : '#'; $M .= "# $language\n"; pass "${from}a comment ${to}"; my @str = ("${from}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}${to}", "${from}${t}${to}", ); if (${to} =~ /^(?:\Q${t}\E)+$/) {fail $_ for @str;} else {pass $_ for @str;} if ($language eq 'Pascal:Alice') { fail "${from}a\nmultiline\ncomment${to}"; } else { pass "${from}a\nmultiline\ncomment${to}"; } pass "${from}${to}"; fail "${from}a ${from}pretend${to} nested comment${to}"; pass "${from}a ${from}pretend${to}"; pass "${from} {) ${to}"; fail "${from}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}"; fail "${mark}\n"; fail "${mark}a comment\n"; fail "${mark}${from}a comment ${to}\n"; fail "${mark}${from}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}\n"; fail "${mark}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}\n"; fail "${mark}a\n${mark}multiline\n${mark}comment\n"; fail "${mark}a comment"; fail "${mark}${from}a comment ${to}"; fail "${mark}${from}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}"; fail "${mark}${from}${t}${t}${t}${t}${t}${t}${t}${t}${t}${t}${to}"; fail "${mark}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}${f}"; } } try $RE{comment}{SQL}{MySQL}; $M .= "# SQL:MySQL\n"; pass "-- \n"; pass "-- a comment\n"; pass "-- /*a comment */\n"; pass "-- /************\n"; pass "-- /////////////\n"; pass "-- ---\n"; fail "--- --\n"; fail "--\n"; pass "-- ---/////////////\n"; fail "-- a\n-- multiline\n-- comment\n"; fail "-- a comment"; fail "-- /*a comment */"; fail "-- /************"; fail "-- /////////////"; pass "#\n"; pass "#a comment\n"; pass "#/*a comment */\n"; pass "#/************\n"; pass "#--////////////\n"; fail "//a comment\n"; fail "///*a comment */\n"; fail "///************\n"; fail "///////////////\n"; fail "//a\n//multiline\n//comment\n"; fail "//a comment"; fail "///*a comment */"; fail "///************"; fail "///////////////"; pass '/*a comment */'; pass '/************/'; pass "/*a\nmultiline\ncomment*/"; fail "/*a /*pretend*/ nested comment*/"; pass "/*a /*pretend*/"; fail "/***********"; pass "/* Comment ;"; fail "/* Comment ; */"; pass "/* Comment ';' */"; pass "/* Comment ';' ;"; pass '/* Comment ";" */'; pass '/* Comment ";" ;'; pass "/* Comment '\n;*/' */"; pass "/* Comment '*/' more comment */"; try $RE{comment}{Brainfuck}; $M .= "# Brainfuck\n"; pass "This is a comment"; pass " "; pass "\n"; pass "\x80\x90\xA0"; fail "[]"; fail "<"; fail "------"; fail "This is - a - comment"; try $RE{comment}{'Algol 68'}; $M .= "# Algol 68\n"; pass "# This is a comment #"; pass "co foo bar co"; pass "co co"; pass "co This is a comment co"; pass "comment This code isn't executed comment"; pass "comment\nMultiline\ncomment"; fail "######################"; fail "# This is not a comment\n"; fail "# # #"; fail "co co co"; fail "comment comment comment"; fail "# Wrong closer co"; fail "# Wrong closer comment"; fail "co foo bar baco"; fail " # foo #"; fail "# foo # "; try $RE{comment}{Squeak}; $M .= "# Squeak\n"; pass '"This is a comment"'; pass '"###########"'; pass '"//"'; pass '""'; pass '"Comment "" with "" double "" quotes"'; fail '#####'; fail '"Multiline"' . "\n" . '"comment"'; fail '"Comment'; fail '"Comment " comment"'; fail '"Comment """ comment"'; try2 $RE{comment}{Fortran}{fixed}; $M .= "# Fortran:fixed\n"; pass2 "!This is a comment\n", "!This is a comment\n"; pass2 "CThis is a comment\n", "CThis is a comment\n"; pass2 "cThis is a comment\n", "cThis is a comment\n"; pass2 "*This is a comment\n", "*This is a comment\n"; pass2 " !This is a comment\n", "!This is a comment\n"; fail " CThis is a comment\n"; fail " cThis is a comment\n"; fail " *This is a comment\n"; fail "!This is a comment"; fail "CThis is a comment"; fail "cThis is a comment"; fail "*This is a comment"; pass2 " !This is a comment\n", "!This is a comment\n"; fail " !This is a comment\n"; pass2 " !This is a comment\n", "!This is a comment\n"; exit if $] < 5.006; exit if $] < 5.008; try $RE{comment}{Beatnik}; $M .= "# Beatnik\n"; pass "is"; pass "IS"; pass "whiskers"; fail "whisker"; fail "Zulu"; fail "Hello"; fail "Is a"; fail "Is;"; try2 $RE{comment}{COBOL}; $M .= "# COBOL\n"; fail "This is a comment\n"; fail "*This is a comment\n"; fail " *This is a comment\n"; fail " *This is a comment\n"; fail " *This is a comment\n"; fail " *This is a comment\n"; fail " *This is a comment\n"; pass2 " *This is a comment\n", "*This is a comment\n"; fail " *This is a comment\n"; fail " *This is a comment\n"; fail " *This is a comment\n"; fail " !This is a comment\n"; fail " *This is a comment"; fail " *This is a comment\n *This is a comment\n"; pass2 " ******************\n", "******************\n"; Regexp-Common-2016020301/t/test_curry.t000755 000765 000024 00000002742 12116413566 020175 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; my $num = $RE{num}{real}; # TEST BASE 10 try $num->{'-base' => '10'}{-sep}; pass 0; pass 1; pass 12; fail 1234567; pass 1.23456789; pass 12.23456789; pass 123.23456789; fail 1234.23456789; pass '+1'; pass '+12'; fail '+1234567.89'; pass '+1,234,567.89'; pass '-1'; pass '-12.333333333333333333333333333333333333333'; fail '-1234567'; pass -1; pass -12; fail -1234567; pass 1.2; fail "a"; fail ""; fail "1a"; fail "- 1234"; pass "1,234,567"; pass "12,345.6789"; fail "1,2345,6789"; fail "1.2345.6789"; # TEST BASE 2 try $num->{'-base' => '2'}{-sep}; pass 0; pass 1; fail 12; fail 1234567; fail 1.23456789; pass '+1'; fail '+12'; fail '+101010'; fail '+101010.0001010'; pass '+101,010.0001010'; fail '+1234567.89'; pass '-1'; pass -1; fail "a"; fail ""; fail "1a"; fail "- 1010"; pass "1,001,101"; pass "1,010.1110"; fail "1,0101,0011"; fail "1.0011.0011"; try $RE{num}{bin}{-sep}; pass 0; pass 1; fail 12; fail 1234567; fail 1.23456789; pass '+1'; fail '+12'; fail '+101010'; fail '+101010.0001010'; fail '+1234567.89'; pass '-1'; pass -1; fail "a"; fail ""; fail "1a"; fail "- 1010"; pass "1,001,101"; pass "1,010.1110"; fail "1,0101,0011"; fail "1.0011.0011"; Regexp-Common-2016020301/t/test_delimited.t000755 000765 000024 00000004112 12643530521 020756 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; if ($] >= 5.006) { # This gives a 'panic: POPSTACK' in 5.005_* eval {"" =~ $RE {delimited}}; ok $@ =~ /Must specify delimiter in \$RE\{delimited}/; } try $RE {delimited} {-delim => ' '}; pass q { a-few-words }; pass q { a\ few\ words }; fail q { a few words }; try $RE{delimited}{qq{-delim$;"}}; pass q{"a few words "}; pass q{"a few \"words\" "}; pass q{"a few 'words' "}; fail q{"a few "words" "}; fail q{'a few words '}; fail q{'a few \"words\" '}; fail q{'a few "words" '}; fail q{a "few" words "}; try $RE{delimited}{qq{-delim$;"}}{qq{-esc$;"}}; pass q{"a few words "}; fail q{"a few \"words\" "}; pass q{"a few ""words"" "}; pass q{"a few 'words' "}; fail q{"a few "words" "}; fail q{a "few" words "}; try $RE{delimited}{qq{-delim$;'}}; fail q{"a few words "}; fail q{"a few \"words\" "}; fail q{"a few 'words' "}; fail q{"a few "words" "}; pass q{'a few words '}; pass q{'a few \"words\" '}; pass q{'a few "words" '}; fail q{a "few" words "}; try $RE{quoted}; pass q{"a few words "}; pass q{"a few \"words\" "}; pass q{"a few 'words' "}; fail q{"a few "words" "}; pass q{'a few words '}; pass q{'a few \"words\" '}; pass q{'a few "words" '}; fail q{a "few" words "}; try $RE{quoted}{qq{-esc$;_!}}; pass q{"a few words "}; fail q{"a few \"words\" "}; pass q{"a few _"words_" "}; pass q{"a few 'words' "}; fail q{"a few "words" "}; pass q{'a few words '}; fail q{'a few \'words\' '}; pass q{'a few !'words!' '}; pass q{'a few "words" '}; fail q{a "few" words "}; try $RE{quoted}{qq{-esc$;}}; pass q{"a few words "}; fail q{"a few \"words\" "}; fail q{"a few _"words_" "}; pass q{"a few 'words' "}; fail q{"a few "words" "}; pass q{'a few words '}; fail q{'a few \'words\' '}; fail q{'a few !'words!' '}; pass q{'a few "words" '}; fail q{a "few" words "}; Regexp-Common-2016020301/t/test_domain.t000755 000765 000024 00000006072 12117651121 020270 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # Domains. my @data = ( ['host.example.com' => 'PPPP'], ['a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z' => 'PPPP'], ['A.B.C.D.E.F.G.H.I.J.K.L.M.N.O.P.Q.R.S.T.U.V.W.X.Y.Z' => 'PPPP'], ['host1.example.com' => 'PPPP'], ['host-1.example.com' => 'PPPP'], ['host' => 'PPPP'], ['a-----------------1.example.com' => 'PPPP'], ['a123456a.example.com' => 'PPPP'], # # 63 char limit # ['abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-0123456789.com' => 'PPPP'], ['abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-0123456789a.com' => 'FFFF'], # # By default, we can match a single space, but not two # [' ', => 'PFPF'], [' ' => 'FFFF'], # # Parts may only start with a number if -rfc1101 is given # ['123host.example.com' => 'FFPP'], ['host.12example.com' => 'FFPP'], # # But it may not look it starts with an IP address # ['127.0.0.1' => 'FFFF'], ['127.0.0.1.com' => 'FFFF'], ['127.0.0.1333.com' => 'FFPP'], # # Parts may not end with a dash # ['host-.example.com' => 'FFFF'], # # May not end with a dot # ['host.example.com.' => 'FFFF'], # # Mind your dots and spaces # ['host. .example.com' => 'FFFF'], ['host..example.com' => 'FFFF'], ['host .example.com' => 'FFFF'], ['ho st.example.com' => 'FFFF'], ); my @pats = ( ['$RE {net} {domain}' => $RE {net} {domain}], ['$RE {net} {domain} {-nospace}' => $RE {net} {domain} {-nospace}], ['$RE {net} {domain} {-rfc1101}' => $RE {net} {domain} {-rfc1101}], ['$RE {net} {domain} {-nospace} {-rfc1101}' => $RE {net} {domain} {-nospace} {-rfc1101}], ); foreach (my $i = 0; $i < @pats; $i ++) { my ($name, $pat) = @{$pats [$i]}; try $pat; $M .= "# Trying $name\n"; foreach my $entry (@data) { my ($domain, $results) = @$entry; my $entry = substr $results, $i, 1; $entry eq 'P' ? pass $domain : fail $domain; } } Regexp-Common-2016020301/t/test_i.t000755 000765 000024 00000004161 12116636105 017252 0ustar00abigailstaff000000 000000 #!/usr/bin/perl # Eventually, this should be tested from the individual test files. use strict; use lib qw {blib/lib}; use Regexp::Common qw /RE_ALL/; use warnings; my @data = ( [[qw /num hex/] => ["abcdef", "123.456", "1a2B.3c"]], [[qw /comment ILLGOL/] => ["NB foo bar\n", "nb foo bar\n"]], [[qw /net domain/] => ["www.perl.com", "WWW.PERL.COM"]], [[qw /net MAC/] => ["a0:b0:c0:d0:e0:f0", "A0:B0:C0:D0:E0:F0"]], [[qw /zip Dutch/] => ["1234 ab", "1234 AB", "nl-1234 AB"]], [[qw /URI HTTP/] => ["HTTP://WWW.PERL.COM"]], [[qw /profanity/] => [map {local $_ = $_; y/a-zA-Z/n-za-mN-ZA-M/; $_} qw / pbpx-fhpxre srygpuvat zhgure-shpxre zhgun-shpxvat fuvgf fuvgre penccvat nefr-ubyr cvff-gnxr jnaxf/]], [[qw /num roman/] => [qw /I i II ii XvIiI CXxxVIiI MmclXXviI/]], ); if ($] >= 5.010) { push @data => ( [[qw /balanced/] => ["()", "(a( )b)"]], ); } my $total = 1; $total += 2 * @{$_ -> [1]} for @data; print "1..$total\n"; print defined $Regexp::Common::VERSION ? "ok 1\n" : "not ok 1\n"; my $count = 1; sub pass { my @a = @_; $a [0] =~ y/a-zA-Z/n-za-mN-ZA-M/ if $a [1] =~ /profanity/; $a [0] =~ s/\n/\\n/g; printf "ok %d - '%s' =~ %s\n", ++ $count, @a } sub fail { my @a = @_; $a [0] =~ y/a-zA-Z/n-za-mN-ZA-M/ if $a [1] =~ /profanity/; $a [0] =~ s/\n/\\n/g; printf "not ok %d - '%s' =~ %s\n", ++ $count, @a } foreach my $data (@data) { my ($name, $queries) = @$data; foreach my $str (@$queries) { local $" = "}{"; eval "\$str =~ /^\$RE{@$name}{-i}\$/ ? pass \$str, '\$RE{@$name}{-i}' : fail \$str, '\$RE{@$name}{-i}'"; die $@ if $@; local $" = "_"; eval "\$str =~ RE_@$name (-i => 1) ? pass \$str, 'RE_@$name (-i => 1)', : fail \$str, 'RE_@$name (-i => 1)'"; die $@ if $@; } } __END__ Regexp-Common-2016020301/t/test_ip.t000755 000765 000024 00000004141 12116413566 017434 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} # sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} sub try { $P = qr /^$_[0]/ } sub pass { ok ($_ [0] =~ $P && $& eq $_ [0]) } sub fail { ok ($_ [0] !~ $P || $& ne $_ [0]) } # LOAD use Regexp::Common; ok; # DOTTED DECIMAL try $RE{net}{IPv4}; pass '0.0.0.0'; pass '1.1.1.1'; pass '255.255.255.255'; pass '255.0.128.23'; fail '256.0.128.23'; fail '255.0.1287.23'; fail '255.a.127.23'; fail '255 0 127 23'; fail '255,0,127,23'; fail '255012723'; try $RE{net}{IPv4}{dec}; pass '0.0.0.0'; pass '1.1.1.1'; pass '255.255.255.255'; pass '255.0.128.23'; fail '256.0.128.23'; fail '255.0.1287.23'; fail '255.a.127.23'; fail '255 0 127 23'; fail '255,0,127,23'; fail '255012723'; # DOTTED HEXADECIMAL # try $RE{net}{IPv4}{hex}; pass '0.0.0.0'; pass '1.1.1.1'; pass '55.55.25.5'; pass '7A.B4.2C.D'; pass 'FF.FF.FF.FF'; fail 'FF.FF.FF.1FF'; fail '255.0.1287.23'; fail '255.a.127.23'; fail '255 0 127 23'; fail '255,0,127,23'; fail '255012723'; try $RE{net}{IPv4}{hex}{-sep=>""}; fail '0.0.0.0'; fail '1.1.1.1'; pass '55552505'; pass '7AB42CD'; pass 'FFFFFFFF'; fail 'FFFFFF1FF'; fail '55 55 25 05'; fail '7A B4 2C D'; fail 'FF FF FF FF'; fail 'FF FF FF 1FF'; try $RE{net}{IPv4}{hex}{-sep=>" "}; fail '0.0.0.0'; fail '1.1.1.1'; fail '55552505'; fail '7AB42CD'; fail 'FFFFFFFF'; fail 'FFFFFF1FF'; pass '55 55 25 05'; pass '7A B4 2C D'; pass 'FF FF FF FF'; fail 'FF FF FF 1FF'; # DOTTED OCTAL # try $RE{net}{IPv4}{oct}; pass '0.0.0.0'; pass '1.1.1.1'; pass '55.55.25.5'; fail '7A.B4.2C.D'; pass '377.377.377.377'; fail '400.400.400.400'; fail '377.377.377.1377'; fail '255.a.127.23'; fail '255 0 127 23'; fail '255,0,127,23'; fail '255012723'; # DOTTED BINARY # try $RE{net}{IPv4}{bin}; pass '0.0.0.0'; pass '1.1.1.1'; pass '101010.101011.1.10000000'; fail '12.01.01.01'; fail '101010101.101011.1.10000000'; fail '10101010-101011-1-10000000'; fail '10101010101011110000000'; Regexp-Common-2016020301/t/test_ipv6.t000644 000765 000024 00000013223 12120063364 017676 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; BEGIN { if ($] < 5.010) { print "1..1\n"; print "ok 1 # \$RE {net} {IPv6} requires 5.010\n"; exit; } } use Regexp::Common; my $count = 0; my $PAT; END {print "1..$count\n"} sub try { $PAT = shift; my $name = shift; print "# $name\n"; } sub pass { my $address = shift; my $r = $address =~ /^$PAT/ && $address eq $&; printf "%s %d # Matching %s\n", $r ? "ok" : "not ok", ++ $count, $address; } sub fail { my $address = shift; my $r = $address !~ /^$PAT/ || $address ne $&; printf "%s %d # Failing %s\n", $r ? "ok" : "not ok", ++ $count, $address; } sub match { my $address = $_ [0]; my @matches = @_; my $r = $address =~ /^$PAT/ && $address eq $&; printf "%s %d # Matching %s\n", $r ? "ok" : "not ok", ++ $count, $address; if (!$r) { for my $i (0 .. @matches) { printf "not ok %d # SKIP\n" => ++ $count; } return; } # # Correct number of matches? # printf "%s %d # Number of matches\n" => @matches == @- - 1 ? "ok" : "not ok", ++ $count; for (my $i = 0; $i < @matches; $i ++) { no strict 'refs'; my $matched = ${$i + 1}; printf "%s %d # \$%d eq '%s'\n" => $matched eq $matches [$i] ? "ok" : "not ok", ++ $count, $i + 1, $matches [$i]; } } try $RE {net} {IPv6} => '$RE {net} {IPv6}'; pass "2001:0db8:85a3:0000:0000:8a2e:0370:7334"; pass "2001:db8:85a3:0:0:8a2e:370:7334"; # Leading 0's removed pass "2001:DB8:85A3:0:0:8A2E:370:7334"; # Upper case allowed pass "2001:Db8:85A3:0:0:8a2E:370:7334"; # Mixed case allowed pass "2001:db8:85a3::8a2e:370:7334"; # Contractions pass "2001:db8::8a2e:370:7334"; pass "2001::8a2e:370:7334"; pass "::8a2e:370:7334"; pass "::370:7334"; pass "::7334"; pass "::"; fail "2001:0db8:85a3:0000:0000:8a2e:0370:7334:1234"; # Too many parts fail "2001:0db8:85a3:0000:0000:8a2e:0370"; # Not enough parts fail "20013:db8:85a3:0:0:8a2e:370:7334"; # Part too long fail "2001:0db8:85a3:0000:0000:8a2e:0370:7334:"; # Trailing separator fail ":2001:0db8:85a3:0000:0000:8a2e:0370:7334"; # Leading separator fail "2001:db8:85a3:0::8a2e:370:7334"; # Only one unit removed fail "2001::8a2e:370::7334"; # Two contractions fail "2001:::8a2e:370:7334"; # Three separators fail "2001.db8.85a3.0.0.8a2e.370.7334"; # Wrong separator try $RE {net} {IPv6} {-style => "hex"} => '$RE {net} {IPv6} {-style => "hex"}'; pass "2001:db8:85a3:0:0:8a2e:370:7334"; # Lower case allowed fail "2001:DB8:85A3:0:0:8A2E:370:7334"; # Upper case not allowed fail "2001:Db8:85A3:0:0:8a2E:370:7334"; # Mixed case not allowed try $RE {net} {IPv6} {-style => "HEX"} => '$RE {net} {IPv6} {-style => "HEX"}'; fail "2001:db8:85a3:0:0:8a2e:370:7334"; # Lower case allowed pass "2001:DB8:85A3:0:0:8A2E:370:7334"; # Upper case not allowed fail "2001:Db8:85A3:0:0:8a2E:370:7334"; # Mixed case not allowed try $RE {net} {IPv6} {-sep => "[.]"} => '$RE {net} {IPv6} {-sep => "[.]"}'; pass "2001.db8.85a3.0.0.8a2e.370.7334"; # Lower case allowed pass "2001.DB8.85A3.0.0.8A2E.370.7334"; # Upper case allowed fail "2001:db8:85a3:0:0:8a2e:370:7334"; # Fail on default sep try $RE {net} {IPv6} {-keep} => '$RE {net} {IPv6} {-keep}'; match "2001:0db8:85a3:0000:0000:8a2e:0370:7334" => "2001", "0db8", "85a3", "0000", "0000", "8a2e", "0370", "7334"; match "2001:0db8:85a3:0:0:8a2e:0370:7334" => "2001", "0db8", "85a3", "0", "0", "8a2e", "0370", "7334"; match "2001:db8:85a3:0:0:8a2e:370:7334" => "2001", "db8", "85a3", "0", "0", "8a2e", "370", "7334"; match "2001:db8:85a3::8a2e:370:7334" => "2001", "db8", "85a3", "", "", "8a2e", "370", "7334"; match "2001:db8::8a2e:370:7334" => "2001", "db8", "", "", "", "8a2e", "370", "7334"; match "2001::8a2e:370:7334" => "2001", "", "", "", "", "8a2e", "370", "7334"; match "::8a2e:370:7334" => "", "", "", "", "", "8a2e", "370", "7334"; match "::370:7334" => "", "", "", "", "", "", "370", "7334"; match "::7334" => "", "", "", "", "", "", "", "7334"; match "::" => "", "", "", "", "", "", "", ""; fail "2001:0db8:85a3:0000:0000:8a2e:0370:7334:1234"; # Too many parts fail "2001:0db8:85a3:0000:0000:8a2e:0370"; # Not enough parts fail "20013:db8:85a3:0:0:8a2e:370:7334"; # Part too long fail "2001:0db8:85a3:0000:0000:8a2e:0370:7334:"; # Trailing separator fail ":2001:0db8:85a3:0000:0000:8a2e:0370:7334"; # Leading separator fail "2001:db8:85a3:0::8a2e:370:7334"; # Only one unit removed fail "2001::8a2e:370::7334"; # Two contractions fail "2001:::8a2e:370:7334"; # Three separators fail "2001.db8.85a3.0.0.8a2e.370.7334"; # Wrong separator try $RE {net} {IPv6} {-style => 'HEX'} {-sep => '[.]'} {-keep} => q [$RE {net} {IPv6} {-style => 'HEX'} {-sep => '[.]'} {-keep}]; match "2001.DB8.85A3..8A2E.370.7334" => "2001", "DB8", "85A3", "", "", "8A2E", "370", "7334"; __END__ Regexp-Common-2016020301/t/test_keep.t000755 000765 000024 00000012066 12116636120 017746 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{my($S,@M)=@_;my $C=0;unshift@M,$S; print"wanted\t[",join('][',@M),"]\n";print"got\t[",join('][',$S=~$P),"]\n";} sub pass{my($S,@M)=@_;my$C=0;unshift@M,$S;foreach($S=~$P){++$C and next if(shift()eq$_);ok(0)&&return;}ok($C>0);} # Shut up some warnings for 5.005. $SIG{__WARN__} = sub { }; # LOAD use Regexp::Common; ok; if ($] >= 5.010) { try $RE{balanced}{-keep}; pass '(a(b))'; try $RE{balanced}{-begin => ">>"}{-end => "<<"}{-keep}; pass '>>>>>>a<<>>b<<<<>>c<<<<'; } try $RE{num}{real}{-keep}; pass '-1.234e+567', qw( - 1.234 1 . 234 e +567 + 567 ); try $RE{num}{dec}{-keep}; pass '-1.234e+567', qw( - 1.234 1 . 234 e +567 + 567 ); try $RE{num}{real}{'-base' => '2'}{-expon=>'x2\^'}{-keep}; pass '-101.010x2^101010', qw( - 101.010 101 . 010 x2^ 101010 ), "", "101010"; try $RE{num}{bin}{-keep}; pass '-101.010E101010', qw( - 101.010 101 . 010 E 101010 ), "", "101010"; try $RE{num}{real}{'-base' => '10'}{-sep}{-keep}; pass '-1,234,567.234e+567', "-", "1,234,567.234", "1,234,567", ".", "234", "e", "+567", "+", "567"; try $RE{comment}{C}{-keep}; pass '/*abc*/', qw( /* abc */ ); try $RE{comment}{'C++'}{-keep}; pass '/*abc*/'; pass "// abc\n"; try $RE{comment}{Perl}{-keep}; pass "# abc\n", "#", " abc", "\n"; try $RE{comment}{shell}{-keep}; pass "# abc\n", "#", " abc", "\n"; try $RE{comment}{Eiffel}{-keep}; pass "-- A comment\n", "--", " A comment", "\n"; pass "---- A comment\n", "--", "-- A comment", "\n"; try $RE{comment}{SQL}{-keep}; pass "-- A comment\n", "--", " A comment", "\n"; pass "---- A comment\n", "----", " A comment", "\n"; try $RE{comment}{HTML}{-keep}; pass "", ""; pass "", ""; pass "", ""; try RE_delimited(-delim=>'/'); pass '/a\/b/', qw( / a\/b / ); try RE_delimited(-delim=>'/', -esc=>'/'); pass '/a//b/', qw( / a//b / ); try RE_net_IPv4; pass '123.234.1.0', qw( 123 234 1 0 ); try RE_list_conj(-word=>'(?:and|or)'); pass 'a, b, and c', ', and '; my $profane = 'uneqba'; my $contextual = 'funttref'; foreach ($profane, $contextual) { tr/A-Za-z/N-ZA-Mn-za-m/ } try RE_profanity; pass $profane; try RE_profanity_contextual; pass $contextual; Regexp-Common-2016020301/t/test_sub_named.t000755 000765 000024 00000001162 12116636251 020757 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common qw (RE_balanced RE_num_real); ok; if ($] >= 5.010) { try RE_balanced; pass '(a(b))'; fail '(a(b)'; } try RE_num_real; pass '-1.234e+567', qw( - 1.234 1 . 234 e +567 + 567 ); try RE_num_real(-base=>2,-expon=>'x2\^'); pass '-101.010x2^101010', qw( - 101.010 101 . 010 x2^ 101010 ), "", "101010"; Regexp-Common-2016020301/t/test_ws.t000755 000765 000024 00000001637 12116413566 017464 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/$_[0]/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; try $RE{ws}{crop}; pass " a sentence here\t\t"; pass " a sentence here"; pass "a sentence here\t\t"; fail "a sentence here"; ok $RE{ws}{crop}->matches(" a sentence here\t\t"); ok $RE{ws}{crop}->matches(" a sentence here"); ok $RE{ws}{crop}->matches("a sentence here\t\t"); ok ! $RE{ws}{crop}->matches("a sentence here"); ok 'a sentence here' eq $RE{ws}{crop}->subs(" a sentence here\t\t"); ok 'a sentence here' eq $RE{ws}{crop}->subs(" a sentence here"); ok 'a sentence here' eq $RE{ws}{crop}->subs("a sentence here\t\t"); ok 'a sentence here' eq $RE{ws}{crop}->subs("a sentence here"); Regexp-Common-2016020301/t/URI/000755 000765 000024 00000000000 12654450573 016242 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/t/zip/000755 000765 000024 00000000000 12654450573 016405 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/t/zzz_50_pod.t000755 000765 000024 00000000373 12116413566 017773 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; eval "use Test::More; 1" or do { print "1..0 # SKIP Test::More required\n"; exit; }; eval "use Test::Pod 1.00; 1" or plan (skip_all => "Test::Pod required for testing POD"); all_pod_files_ok (); __END__ Regexp-Common-2016020301/t/zzz_60_pod_coverage.t000755 000765 000024 00000000456 12116413566 021651 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; eval "use Test::More; 1" or do { print "1..0 # SKIP Test::More required\n"; exit; }; eval "use Test::Pod::Coverage 1.00; 1" or plan (skip_all => "Test::Pod::Coverage required for testing POD coverage"); all_pod_coverage_ok ({private => [qr /^/]}); __END__ Regexp-Common-2016020301/t/zip/australia.t000755 000765 000024 00000012575 12116413566 020566 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use warnings; sub passes; sub failures; use constant FAIL => 5; my $normal = $RE {zip} {Australia}; my $prefix = $RE {zip} {Australian} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Australia} {-prefix => 'no'}; my $iso = $RE {zip} {Australian} {-country => "iso"}; my $cept = $RE {zip} {Australia} {-country => "cept"}; my $country = $RE {zip} {Australian} {-country => "Aus"}; my $iso_prefix = $iso -> {-prefix => 'yes'}; my $cept_prefix = $cept -> {-prefix => 'yes'}; my @tests = ( [ normal => $normal => [qw /1 1 1 0/]], [ prefix => $prefix => [qw /0 1 1 0/]], ['no prefix' => $no_prefix => [qw /1 0 0 0/]], [ iso => $iso => [qw /1 0 1 0/]], [ cept => $cept => [qw /1 1 0 0/]], [ country => $country => [qw /1 0 0 1/]], ['iso prefix' => $iso_prefix => [qw /0 0 1 0/]], ['cept prefix' => $cept_prefix => [qw /0 1 0 0/]], ); my @states = (2, 8, 9, '02', '08', '09', 10 .. 97); my @failures = failures; my $count; sub mess {print ++ $count, " - $_ (@_)\n"} sub pass {print "ok "; &mess} sub fail {print "not ok "; &mess} my $m = 0; my $k = 0; foreach my $test (@tests) { $m ++ foreach @{$test -> [2]}; $k ++ foreach grep {$_} @{$test -> [2]}; } my $max = 1; $max += @states * $m; $max += @states * $k; $max += @failures * @tests; print "1..$max\n"; print "ok ", ++ $count, "\n"; sub run_test { my ($name, $re, $should_match) = @_; my $match = "<<$_>>" =~ /$re/; my $good = $match && $_ eq $&; my $line = $good ? "match" : $match ? "wrong match (got: $&)" : "no match"; $line .= "; $name"; if ($should_match) {$good ? pass $line : fail $line} else {$good ? fail $line : pass $line} } sub array_cmp { my ($a1, $a2) = @_; return 0 unless @$a1 eq @$a2; foreach my $i (0 .. $#$a1) { !defined $$a1 [$i] && !defined $$a2 [$i] || defined $$a1 [$i] && defined $$a2 [$i] && $$a1 [$i] eq $$a2 [$i] or return 0; } return 1; } sub __ {map {defined () ? $_ : "UNDEF"} @_} sub run_keep { my ($name, $re, $parts) = @_; my @chunks = /^$re->{-keep}$/; unless (@chunks) {fail "no match; $name - keep"; return} array_cmp (\@chunks, [$_ => @$parts]) ? pass "match; $name - keep" : fail "wrong match [@{[__ @chunks]}]; $name - keep" } sub _ { my $min = $_ [0]; my $max = @_ > 1 ? $_ [1] : $_ [0]; my $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); $x; } my %cache; foreach my $x (@states) { my $y = $x =~ /9$/ ? "09" : _ 2; my @t = ([undef, "$x$y", $x, $y], ["AUS", "$x$y", $x, $y], ["AU", "$x$y", $x, $y], ["Aus", "$x$y", $x, $y]); my $c = 0; foreach my $t (@t) { local $_ = defined $t -> [0] ? $t -> [0] . "-" : ""; $_ .= join "" => @$t [2 .. 3]; foreach my $test (@tests) { my ($name, $re, $matches) = @$test; run_test $name, $re, $matches -> [$c]; run_keep $name, $re -> {-keep}, $t if $matches -> [$c]; } $c ++; } } foreach (@failures) { foreach my $test (@tests) { my ($name, $re) = @$test; /^$re$/ ? fail "match; $name" : pass "no match; $name"; } } sub failures { my @failures = ("", " "); # Too short. push @failures => 0 .. 9; for (1 .. FAIL) { my $x = _ 1, 3; redo if $x =~ /^[28]..$/ || $x eq "909" || $cache {$x} ++; push @failures => $x; } # Too long. for (1 .. FAIL) { my $x = _ 5, 10; redo if $cache {$x} ++; push @failures => $x; } for my $c ('.', ';', '-', ' ', '+') { for (1 .. FAIL) { my $x = _ 3; $x .= $c; redo if $cache {$x} ++; push @failures => $x; } for (1 .. FAIL) { my $x = _ 3; $x = "$c$x"; redo if $cache {$x} ++; push @failures => $x; } } # Wrong states for my $s ('00', '01', '03' .. '07') { my $x = _ 2; my $zip = "$s$x"; redo if $cache {$zip} ++; push @failures => $zip; } # Test leading '9'/'09'. OUTER: for (1 .. FAIL) { my $x = _ 2; redo if $x eq "09"; for my $s ("9", "09") { my $zip = "$s$x"; redo OUTER if $cache {$zip} ++; push @failures => $zip; } } # Same failures, with country in front of it as well. push @failures => map {("AUS-$_", "AU-$_")} @failures; # Wrong countries. for (1 .. FAIL) { my $c = join "" => map {('A' .. 'Z') [rand 26]} 1 .. 1 + int rand 3; redo if $c eq "AUS" || $c eq "AU" || $cache {$c} ++; my $x = _ 4; push @failures => "$c-$x"; } for (1 .. FAIL) { my $c = ('A' .. 'Z') [rand 26]; redo if $cache {$c} ++; my $x = _ 4; push @failures => "${c}AUS-$x"; push @failures => "AUS$c-$x"; push @failures => "${c}AU-$x"; $c =~ y!S!Z!; push @failures => "AU$c-$x"; } for (1 .. FAIL) { my $x = _ 4; push @failures => "aus-$x", "au-$x"; } @failures; } __END__ Regexp-Common-2016020301/t/zip/belgium.t000755 000765 000024 00000007721 12116413566 020222 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; use warnings; sub create_parts; my $normal = $RE {zip} {Belgium}; my $prefix = $RE {zip} {Belgium} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Belgium} {-prefix => 'no'}; my $iso = $RE {zip} {Belgium} {-country => "iso"}; my $cept = $RE {zip} {Belgium} {-country => "cept"}; my $country = $RE {zip} {Belgium} {-country => "BEL"}; my $iso_prefix = $iso -> {-prefix => 'yes'}; my $cept_prefix = $cept -> {-prefix => 'yes'}; my @tests = ( [ normal => $normal => {no_prefix => NORMAL_PASS | FAIL, iso_prefix => NORMAL_PASS | FAIL, cept_prefix => NORMAL_PASS | FAIL, prefix_b => NORMAL_FAIL, prefix_be => NORMAL_FAIL, prefix_BEL => NORMAL_FAIL}], [ prefix => $prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ['no prefix' => $no_prefix => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_FAIL}], [ iso => $iso => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_FAIL}], [ cept => $cept => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_PASS}], [ country => $country => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_FAIL, prefix_BEL => NORMAL_PASS}], ['iso prefix' => $iso_prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_FAIL}], ['cept prefix' => $cept_prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_PASS}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::zip", tests => \@tests, good => $good, bad => $bad, query => \&zip, wanted => \&wanted; sub zip { my ($tag, $parts) = @_; my $zip = $$parts [0] . $$parts [1]; return $zip if $tag eq "no_prefix"; return "BE-$zip" if $tag eq "iso_prefix"; return "B-$zip" if $tag eq "cept_prefix"; return "$1-$zip" if $tag =~ /^prefix_(.*)/; die "Unknown tag '$tag' in &zip\n"; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = undef; $wanted [1] = "BE" if $tag eq "iso_prefix"; $wanted [1] = "B" if $tag eq "cept_prefix"; $wanted [1] = $1 if $tag =~ /^prefix_(.*)/; $wanted [2] = $$parts [0] . $$parts [1]; push @wanted => @$parts [0, 1]; return \@wanted; } sub _ { my ($min, $max, $cache) = @_; my $x; { $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); redo if $cache -> {$x} ++; } $x; } sub create_parts { my (@good, @bad); # Provinces. $good [0] = [1 .. 9]; $bad [0] = [0]; # Distributions. my $c = {'000' => 1}; $good [1] = ['000', map {_ 3, 3, $c} 2 .. 20]; $bad [1] = ["", "fnord", (map {_ 1, 2, $c} 1 .. 4), (map {_ 4, 6, $c} 1 .. 4)]; # 'Fake' entries for "wrong" country codes. (\@good, \@bad) } __END__ Regexp-Common-2016020301/t/zip/denmark.t000755 000765 000024 00000010453 12116413566 020213 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; use warnings; sub create_parts; my $normal = $RE {zip} {Denmark}; my $iso = $RE {zip} {Denmark} {-country => "iso"}; my $cept = $RE {zip} {Denmark} {-country => "cept"}; my $country = $RE {zip} {Denmark} {-country => "DEN"}; my ($prefix, $no_prefix, $iso_prefix, $cept_prefix); unless ($] < 5.00503) { $prefix = $RE {zip} {Denmark} {-prefix => 'yes'}; $no_prefix = $RE {zip} {Denmark} {-prefix => 'no'}; $iso_prefix = $iso -> {-prefix => 'yes'}; $cept_prefix = $cept -> {-prefix => 'yes'}; } my @tests = ( [ normal => $normal => {no_prefix => NORMAL_PASS | FAIL, iso_prefix => NORMAL_PASS | FAIL, cept_prefix => NORMAL_PASS | FAIL, prefix_dk => NORMAL_FAIL, prefix_DEN => NORMAL_FAIL}], [ iso => $iso => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], [ cept => $cept => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], [ country => $country => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_FAIL, prefix_DEN => NORMAL_PASS}], ); push @tests => ( [ prefix => $prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ['no prefix' => $no_prefix => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_FAIL}], ['iso prefix' => $iso_prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ['cept prefix' => $cept_prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ) unless $] < 5.00503; my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::zip", tests => \@tests, good => $good, bad => $bad, query => \&zip, wanted => \&wanted; sub zip { my ($tag, $parts) = @_; my $zip = $$parts [0] . $$parts [1] . $$parts [2]; return $zip if $tag eq "no_prefix"; return "DK-$zip" if $tag eq "iso_prefix"; return "DK-$zip" if $tag eq "cept_prefix"; return "$1-$zip" if $tag =~ /^prefix_(.*)/; die "Unknown tag '$tag' in &zip\n"; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = undef; $wanted [1] = "DK" if $tag eq "iso_prefix"; $wanted [1] = "DK" if $tag eq "cept_prefix"; $wanted [1] = $1 if $tag =~ /^prefix_(.*)/; $wanted [2] = $$parts [0] . $$parts [1] . $$parts [2]; push @wanted => @$parts [0, 1, 2]; return \@wanted; } sub _ { my ($min, $max, $cache, $exclude) = @_; $exclude ||= {}; my $x; { $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); redo if $exclude -> {$x} || $cache -> {$x} ++; } $x; } sub create_parts { my (@good, @bad); # Distribution regions. my $a = {}; $good [0] = [map {_ 1, 1, $a, {0 => 1}} 1 .. 4]; $bad [0] = [0]; # Distribution district. my $b = {0 => 1}; $good [1] = [0, map {_ 1, 1, $b} 1 .. 4]; $bad [1] = ['a']; # Other numbers. my $c = {'00' => 1}; $good [2] = ['00', map {_ 2, 2, $c} 2 .. 10]; $bad [2] = ["", "fnord", (map {_ 1, 1, $c} 1 .. 2), (map {_ 3, 6, $c} 1 .. 2)]; (\@good, \@bad) } __END__ Regexp-Common-2016020301/t/zip/france.t000755 000765 000024 00000012017 12116413566 020026 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use warnings; sub passes; sub failures; use constant PASSES => 20; use constant FAIL => 5; my $normal = $RE {zip} {France}; my $prefix = $RE {zip} {French} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {France} {-prefix => 'no'}; my $iso = $RE {zip} {French} {-country => "iso"}; my $cept = $RE {zip} {France} {-country => "cept"}; my $country = $RE {zip} {French} {-country => "Fr"}; my $iso_prefix = $iso -> {-prefix => 'yes'}; my $cept_prefix = $cept -> {-prefix => 'yes'}; my @tests = ( [ normal => $normal => [qw /1 1 1 0/]], [ prefix => $prefix => [qw /0 1 1 0/]], ['no prefix' => $no_prefix => [qw /1 0 0 0/]], [ iso => $iso => [qw /1 0 1 0/]], [ cept => $cept => [qw /1 1 0 0/]], [ country => $country => [qw /1 0 0 1/]], ['iso prefix' => $iso_prefix => [qw /0 0 1 0/]], ['cept prefix' => $cept_prefix => [qw /0 1 0 0/]], ); my @depts = ('00' .. '98'); my @failures = failures; my $count; sub mess {print ++ $count, " - $_ (@_)\n"} sub pass {print "ok "; &mess} sub fail {print "not ok "; &mess} my $m = 0; my $k = 0; foreach my $test (@tests) { $m ++ foreach @{$test -> [2]}; $k ++ foreach grep {$_} @{$test -> [2]}; } my $max = 1; $max += @depts * $m; $max += @depts * $k; $max += @failures * @tests; print "1..$max\n"; print "ok ", ++ $count, "\n"; sub run_test { my ($name, $re, $should_match) = @_; my $match = "<<$_>>" =~ /$re/; my $good = $match && $_ eq $&; my $line = $good ? "match" : $match ? "wrong match (got: $&)" : "no match"; $line .= "; $name"; if ($should_match) {$good ? pass $line : fail $line} else {$good ? fail $line : pass $line} } sub array_cmp { my ($a1, $a2) = @_; return 0 unless @$a1 eq @$a2; foreach my $i (0 .. $#$a1) { !defined $$a1 [$i] && !defined $$a2 [$i] || defined $$a1 [$i] && defined $$a2 [$i] && $$a1 [$i] eq $$a2 [$i] or return 0; } return 1; } sub __ {map {defined () ? $_ : "UNDEF"} @_} sub run_keep { my ($name, $re, $parts) = @_; my @chunks = /^$re->{-keep}$/; unless (@chunks) {fail "no match; $name - keep"; return} array_cmp (\@chunks, [$_ => @$parts]) ? pass "match; $name - keep" : fail "wrong match [@{[__ @chunks]}]; $name - keep" } sub _ { my $min = $_ [0]; my $max = @_ > 1 ? $_ [1] : $_ [0]; my $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); $x; } my %cache; # foreach my $d (1 .. PASSES) { foreach my $x (@depts) { my ($y) = qw /000/; while ($cache {$y} ++) { $y = _ 3; } my @t = ([undef, "$x$y", $x, $y], ["F", "$x$y", $x, $y], ["FR", "$x$y", $x, $y], ["Fr", "$x$y", $x, $y]); my $c = 0; foreach my $t (@t) { local $_ = defined $t -> [0] ? $t -> [0] . "-" : ""; $_ .= join "" => @$t [2 .. 3]; foreach my $test (@tests) { my ($name, $re, $matches) = @$test; run_test $name, $re, $matches -> [$c]; run_keep $name, $re -> {-keep}, $t if $matches -> [$c]; } $c ++; } } foreach (@failures) { foreach my $test (@tests) { my ($name, $re) = @$test; /^$re$/ ? fail "match; $name" : pass "no match; $name"; } } sub failures { my @failures = ("", " "); # Too short. push @failures => 0 .. 9; for (1 .. FAIL) { my $x = _ 2, 4; redo if $cache {$x} ++; push @failures => $x; } # Too long. for (1 .. FAIL) { my $x = _ 6, 10; redo if $cache {$x} ++; push @failures => $x; } for my $c ('.', ';', '-', ' ', '+') { for (1 .. FAIL) { my $x = _ 4; $x .= $c; redo if $cache {$x} ++; push @failures => $x; } for (1 .. FAIL) { my $x = _ 4; $x = "$c$x"; redo if $cache {$x} ++; push @failures => $x; } } # Wrong departments. for (1 .. FAIL) { my $x = _ 3; # redo if $cache {"00$x"} ++; push @failures => "99$x"; } # Same failures, with country in front of it as well. push @failures => map {("FR-$_", "F-$_")} @failures; # Wrong countries. for (1 .. FAIL) { my $c = join "" => map {('A' .. 'Z') [rand 26]} 1 .. 2; redo if $c eq "FR" || $c eq "F" || $cache {$c} ++; my $x = _ 5; push @failures => "$c-$x"; } for (1 .. FAIL) { my $c = ('A' .. 'Z') [rand 26]; redo if $cache {$c} ++; my $x = _ 5; push @failures => "${c}FR-$x"; push @failures => "FR$c-$x"; } for (1 .. FAIL) { my $x = _ 5; redo if $cache {"fr-$x"} ++; push @failures => "fr-$x", "f-$x"; } @failures; } __END__ Regexp-Common-2016020301/t/zip/germany.t000755 000765 000024 00000011537 12116413566 020240 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use warnings; sub passes; sub failures; use constant PASSES => 20; use constant FAIL => 5; my $normal = $RE {zip} {Germany}; my $prefix = $RE {zip} {German} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Germany} {-prefix => 'no'}; my $iso = $RE {zip} {German} {-country => "iso"}; my $cept = $RE {zip} {Germany} {-country => "cept"}; my $country = $RE {zip} {German} {-country => "DE"}; my $iso_prefix = $iso -> {-prefix => 'yes'}; my $cept_prefix = $cept -> {-prefix => 'yes'}; my @tests = ( [ normal => $normal => [qw /1 1 1/]], [ prefix => $prefix => [qw /0 1 1/]], ['no prefix' => $no_prefix => [qw /1 0 0/]], [ iso => $iso => [qw /1 0 1/]], [ cept => $cept => [qw /1 1 0/]], [ country => $country => [qw /1 0 1/]], ['iso prefix' => $iso_prefix => [qw /0 0 1/]], ['cept prefix' => $cept_prefix => [qw /0 1 0/]], ); my @failures = failures; my $count; sub mess {print ++ $count, " - $_ (@_)\n"} sub pass {print "ok "; &mess} sub fail {print "not ok "; &mess} my $m = 0; my $k = 0; foreach my $test (@tests) { $m ++ foreach @{$test -> [2]}; $k ++ foreach grep {$_} @{$test -> [2]}; } my $max = 1; $max += PASSES * $m; $max += PASSES * $k; $max += @failures * @tests; print "1..$max\n"; print "ok ", ++ $count, "\n"; sub run_test { my ($name, $re, $should_match) = @_; my $match = "<<$_>>" =~ /$re/; my $good = $match && $_ eq $&; my $line = $good ? "match" : $match ? "wrong match (got: $&)" : "no match"; $line .= "; $name"; if ($should_match) {$good ? pass $line : fail $line} else {$good ? fail $line : pass $line} } sub array_cmp { my ($a1, $a2) = @_; return 0 unless @$a1 eq @$a2; foreach my $i (0 .. $#$a1) { !defined $$a1 [$i] && !defined $$a2 [$i] || defined $$a1 [$i] && defined $$a2 [$i] && $$a1 [$i] eq $$a2 [$i] or return 0; } return 1; } sub __ {map {defined () ? $_ : "UNDEF"} @_} sub run_keep { my ($name, $re, $parts) = @_; my @chunks = /^$re->{-keep}$/; unless (@chunks) {fail "no match; $name - keep"; return} array_cmp (\@chunks, [$_ => @$parts]) ? pass "match; $name - keep" : fail "wrong match [@{[__ @chunks]}]; $name - keep" } sub _ { my $min = $_ [0]; my $max = @_ > 1 ? $_ [1] : $_ [0]; my $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); $x; } my %cache; foreach my $d (1 .. PASSES) { my ($x, $y, $z) = qw /0 0 000/; while ($cache {"$x$y$z"} ++) { $x = _ 1; $y = _ 1; $z = _ 3; } my @t = ([undef, "$x$y$z", $x, $y, $z], ["D", "$x$y$z", $x, $y, $z], ["DE", "$x$y$z", $x, $y, $z]); my $c = 0; foreach my $t (@t) { local $_ = defined $t -> [0] ? $t -> [0] . "-" : ""; $_ .= join "" => @$t [2 .. 4]; foreach my $test (@tests) { my ($name, $re, $matches) = @$test; run_test $name, $re, $matches -> [$c]; run_keep $name, $re -> {-keep}, $t if $matches -> [$c]; } $c ++; } } foreach (@failures) { foreach my $test (@tests) { my ($name, $re) = @$test; /^$re$/ ? fail "match; $name" : pass "no match; $name"; } } sub failures { my @failures = ("", " "); # Too short. push @failures => 0 .. 9; for (1 .. FAIL) { my $x = _ 2, 4; redo if $cache {$x} ++; push @failures => $x; } # Too long. for (1 .. FAIL) { my $x = _ 6, 10; redo if $cache {$x} ++; push @failures => $x; } for my $c ('.', ';', '-', ' ', '+') { for (1 .. FAIL) { my $x = _ 4; $x .= $c; redo if $cache {$x} ++; push @failures => $x; } for (1 .. FAIL) { my $x = _ 4; $x = "$c$x"; redo if $cache {$x} ++; push @failures => $x; } } # Same failures, with country in front of it as well. push @failures => map {("DE-$_", "D-$_")} @failures; # Wrong countries. for (1 .. FAIL) { my $c = join "" => map {('A' .. 'Z') [rand 26]} 1 .. 2; redo if $c eq "DE" || $c eq "D" || $cache {$c} ++; my $x = _ 5; push @failures => "$c-$x"; } for (1 .. FAIL) { my $c = ('A' .. 'Z') [rand 26]; redo if $cache {$c} ++; my $x = _ 5; push @failures => "${c}DE-$x"; push @failures => "DE$c-$x"; } for (1 .. FAIL) { my $x = _ 5; redo if $cache {"de-$x"} ++; push @failures => "de-$x", "d-$x"; } @failures; } __END__ Regexp-Common-2016020301/t/zip/greenland.t000644 000765 000024 00000010141 12116413566 020520 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; use warnings; sub create_parts; my $normal = $RE {zip} {Greenland}; my $prefix = $RE {zip} {Greenland} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Greenland} {-prefix => 'no'}; my $iso = $RE {zip} {Greenland} {-country => "iso"}; my $cept = $RE {zip} {Greenland} {-country => "cept"}; my $country = $RE {zip} {Greenland} {-country => "DEN"}; my $iso_prefix = $iso -> {-prefix => 'yes'}; my $cept_prefix = $cept -> {-prefix => 'yes'}; my @tests = ( [ normal => $normal => {no_prefix => NORMAL_PASS | FAIL, iso_prefix => NORMAL_PASS | FAIL, cept_prefix => NORMAL_PASS | FAIL, prefix_dk => NORMAL_FAIL, prefix_DEN => NORMAL_FAIL}], [ prefix => $prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ['no prefix' => $no_prefix => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_FAIL}], [ iso => $iso => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], [ cept => $cept => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], [ country => $country => {no_prefix => NORMAL_PASS, iso_prefix => NORMAL_FAIL, cept_prefix => NORMAL_FAIL, prefix_DEN => NORMAL_PASS}], ['iso prefix' => $iso_prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ['cept prefix' => $cept_prefix => {no_prefix => NORMAL_FAIL, iso_prefix => NORMAL_PASS, cept_prefix => NORMAL_PASS}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::zip", tests => \@tests, good => $good, bad => $bad, query => \&zip, wanted => \&wanted; sub zip { my ($tag, $parts) = @_; my $zip = $$parts [0] . $$parts [1]; return $zip if $tag eq "no_prefix"; return "DK-$zip" if $tag eq "iso_prefix"; return "DK-$zip" if $tag eq "cept_prefix"; return "$1-$zip" if $tag =~ /^prefix_(.*)/; die "Unknown tag '$tag' in &zip\n"; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = undef; $wanted [1] = "DK" if $tag eq "iso_prefix"; $wanted [1] = "DK" if $tag eq "cept_prefix"; $wanted [1] = $1 if $tag =~ /^prefix_(.*)/; $wanted [2] = $$parts [0] . $$parts [1]; push @wanted => @$parts [0, 1]; return \@wanted; } sub _ { my ($min, $max, $cache, $exclude) = @_; $exclude ||= {}; my $x; { $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); redo if $exclude -> {$x} || $cache -> {$x} ++; } $x; } sub create_parts { my (@good, @bad); # Distribution district. my $a = {}; $good [0] = [39]; again: $bad [0] = ['00', "quux", map {_ 2, 2, $a} 1 .. 3]; goto again if grep {$_ eq "39"} @{$bad [0]}; # Other numbers. my $c = {'00' => 1}; $good [1] = ['00', map {_ 2, 2, $c} 2 .. 10]; $bad [1] = ["", "fnord", (map {_ 1, 1, $c} 1 .. 2), (map {_ 3, 6, $c} 1 .. 2)]; (\@good, \@bad) } __END__ Regexp-Common-2016020301/t/zip/italy.t000755 000765 000024 00000007122 12116413566 017713 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common qw /RE_zip_Italy/; use t::Common qw /run_new_tests cross d pd dd a/; # use warnings; sub create_parts; my $italy = $RE {zip} {Italy}; my $yes_prefix = $RE {zip} {Italy} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Italy} {-prefix => 'no'}; my $iso_prefix = $RE {zip} {Italy} {-country => 'iso'}; my $cept_prefix = $RE {zip} {Italy} {-country => 'cept'}; my $own_prefix = $RE {zip} {Italy} {-country => 'it'}; use constant FAIL => 5; my @base = ([0, pd], [0, pd], [0, pd], [0, pd], [0, pd]); my $zips = [cross @base]; my @long = map {dd 6 => 10} 1 .. FAIL; my @short = map {dd 1 => 4} 1 .. FAIL; my @letter = map {my $z = dd 5; substr $z, rand (5), 1, a; $z} 1 .. FAIL; my $wrong = [@long, @short, @letter]; my %targets = ( no_prefix => { list => $zips, query => sub {join "" => @_}, wanted => sub {$_, undef, join ("" => @_), @_}, }, iso_prefix => { list => $zips, query => sub {"IT-" . join "" => @_}, wanted => sub {$_, "IT", join ("" => @_), @_}, }, cept_prefix => { list => $zips, query => sub {"I-" . join "" => @_}, wanted => sub {$_, "I", join ("" => @_), @_}, }, own_prefix => { list => $zips, query => sub {"it-" . join "" => @_}, wanted => sub {$_, "it", join ("" => @_), @_}, }, wrong1 => { list => $wrong, query => sub {$_ [0]}, }, wrong2 => { list => $wrong, query => sub {"IT-" . $_ [0]}, }, wrong3 => { list => $wrong, query => sub {"I-" . $_ [0]}, }, wrong4 => { list => $zips, query => sub {"IT " . join "" => @_}, }, ); my @wrongs = qw /wrong1 wrong2 wrong3 wrong4/; my @tests = ( { name => 'basic', regex => $italy, pass => [qw /no_prefix iso_prefix cept_prefix/], fail => [qw /own_prefix/, @wrongs], sub => \&RE_zip_Italy, }, { name => 'yes_prefix', regex => $yes_prefix, pass => [qw /iso_prefix cept_prefix/], fail => [qw /no_prefix own_prefix/, @wrongs], sub => \&RE_zip_Italy, sub_args => [-prefix => 'yes'], }, { name => 'no_prefix', regex => $no_prefix, pass => [qw /no_prefix/], fail => [qw /iso_prefix cept_prefix own_prefix/, @wrongs], sub => \&RE_zip_Italy, sub_args => [-prefix => 'no'], }, { name => 'iso_prefix', regex => $iso_prefix, pass => [qw /no_prefix iso_prefix/], fail => [qw /cept_prefix own_prefix/, @wrongs], sub => \&RE_zip_Italy, sub_args => [-country => 'iso'], }, { name => 'cept_prefix', regex => $cept_prefix, pass => [qw /no_prefix cept_prefix/], fail => [qw /iso_prefix own_prefix/, @wrongs], sub => \&RE_zip_Italy, sub_args => [-country => 'cept'], }, { name => 'own_prefix', regex => $own_prefix, pass => [qw /no_prefix own_prefix/], fail => [qw /iso_prefix cept_prefix/, @wrongs], sub => \&RE_zip_Italy, sub_args => [-country => 'it'], }, ); run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::zip', ; __END__ Regexp-Common-2016020301/t/zip/netherlands.t000755 000765 000024 00000020070 12116413566 021075 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use warnings; sub failures; use constant PASSES => 20; use constant FAIL => 10; my $count; my $normal = $RE {zip} {Netherlands}; my $no_space = $RE {zip} {Dutch} {-sep => ""}; my $dash = $RE {zip} {Netherlands} {-sep => "-"}; my $prefix = $RE {zip} {Dutch} {-prefix => "yes"}; my $no_prefix = $RE {zip} {Netherlands} {-prefix => "no"}; my $iso = $RE {zip} {Dutch} {-country => "iso"}; my $cept = $RE {zip} {Netherlands} {-country => "cept"}; my $country = $RE {zip} {Dutch} {-country => "NLD"}; my $dash_prefix = $dash -> {-prefix => "yes"}; my $dash_no_prefix = $dash -> {-prefix => "no"}; my @tests = ( [ normal => $normal => [qw /1 1 0 0 0 0 0/]], [ no_space => $no_space => [qw /0 0 1 1 0 0 0/]], [ dash => $dash => [qw /0 0 0 0 1 1 0/]], [ prefix => $prefix => [qw /0 1 0 0 0 0 0/]], ['no prefix' => $no_prefix => [qw /1 0 0 0 0 0 0/]], [ iso => $iso => [qw /1 1 0 0 0 0 0/]], [ cept => $cept => [qw /1 1 0 0 0 0 0/]], [ country => $country => [qw /1 0 0 0 0 0 1/]], ['dash & prefix' => $dash_prefix => [qw /0 0 0 0 0 1 0/]], ['dash & no prefix' => $dash_no_prefix => [qw /0 0 0 0 1 0 0/]], ); my @failures = failures; sub mess {print ++ $count, " - $_ (@_)\n"} sub pass {print "ok "; &mess} sub fail {print "not ok "; &mess} my $max = 1 + 2 * @tests * @{$tests [0] -> [2]} * PASSES + @failures * @tests; print "1..$max\n"; print "ok ", ++ $count, "\n"; sub run_test { my ($name, $re, $should_match) = @_; my $match = /^$re$/; my $line = $match ? "match" : "no match"; $line .= "; $name"; ($match xor $should_match) ? fail $line : pass $line } sub __ {map {defined () ? $_ : "UNDEF"} @_} sub run_keep { my ($name, $re, $should_match) = splice @_ => 0, 3; unless ($should_match) { if (/^$re$/) {fail "match; keep - $name"} else {pass "no match; keep - $name"} return; } my @exp = ($_, $_ [0], join ("" => grep {defined} @_ [1 .. 3]), @_ [1 .. 3]); if (my @args = /^$re$/) { unshift @_ => $_; unless (@exp == @args) { fail "match; keep - $name [@{[__ @args]}]"; } foreach my $n (0 .. $#_) { unless (!defined $exp [$n] && !defined $args [$n] || defined $exp [$n] && defined $args [$n] && $exp [$n] eq $args [$n]) { fail "match; keep - $name [@{[__ @args]}]"; return; } } pass "match; keep - $name"; return; } fail "no match; keep - $name"; } sub _n { my $min = $_ [0]; my $max = @_ > 1 ? $_ [1] : $_ [0]; my $x = 1 + int rand 9; $x .= int rand 10 for 2 .. $_ [0] + int rand (1 + $max - $min); $x; } sub _l { my $min = $_ [0]; my $max = @_ > 1 ? $_ [1] : $_ [0]; my @l = ('A' .. 'Z'); my $x = ""; $x .= $l [int rand @l] for 1 .. $_ [0] + int rand (1 + $max - $min); $x; } my %cache; foreach my $d (1 .. PASSES) { my $n = _n 4; my $l = _l 2; $l = _l 2 while $l =~ /[FIOQUY]/ || $l =~ /S[ADS]/; redo if $cache {"$n $l"}; my @t = ([undef, $n, " ", $l], ["NL", $n, " ", $l], [undef, $n, "", $l], ["NL", $n, "", $l], [undef, $n, "-", $l], ["NL", $n, "-", $l], ["NLD", $n, " ", $l]); my $c = 0; foreach my $t (@t) { local $_ = defined $t -> [0] ? $t -> [0] . "-" : ""; $_ .= join "" => grep {defined} @{$t} [1 .. 3]; foreach my $test (@tests) { my ($name, $re, $matches) = @$test; run_test $name, $re, $matches -> [$c]; run_keep $name, $re -> {-keep}, $matches -> [$c], @$t; } $c ++; } } foreach (@failures) { foreach my $test (@tests) { my ($name, $re) = @$test; /^$re$/ ? fail "match; $name" : pass "no match; $name"; } } sub failures { my @failures = ("", " "); # Zip starting with '0'. foreach (1 .. FAIL) { my $x = _n 4; $x =~ s/^./0/; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Too few numbers. foreach (1 .. FAIL) { my $x = _n 1, 3; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Too many numbers. foreach (1 .. FAIL) { my $x = _n 5, 10; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Too few letters. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 1; $y = _l 1 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Too many letters. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 3, 6; $y = _l 3, 6 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Wrong letters. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 until $y =~ /[FIOQUY]/; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Wrong letter combos. foreach (1 .. FAIL) { my $x = _n 4; my $y .= ('SA', 'SD', 'SS') [rand 3]; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Wrong separator. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; my $s = int rand 256; redo if +($s & 0x7F) < 0x20; $s = chr $s; redo if $s eq ' ' || $s eq '-'; redo if $cache {"$x$s$y"} ++; push @failures => "$x$s$y"; } # Lowercase letters. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 1; $y = _l 1 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; $y = lc $y; redo if $cache {"$x $y"} ++; push @failures => "$x $y"; } # Letters, then numbers. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"$y $x"} ++; push @failures => "$y $x"; } # Leading/trailing garbage. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {" $x $y"} ++ or $cache {"$x $y "} ++; push @failures => " $x $y", "$x $y "; } push @failures => map {"NL-$_"} @failures; # Wrong countries. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; my $c = _l 2; $c = _l 2 while $c eq "NL"; redo if $cache {"$c-$x $y"} ++; push @failures => "$c-$x $y"; } # Lowercase countries. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; redo if $cache {"nl-$x $y"} ++; push @failures => "nl-$x $y"; } # Too many letters in country. foreach (1 .. FAIL) { my $x = _n 4; my $y .= _l 2; $y = _l 2 while $y =~ /[FIOQUY]/ || $y =~ /S[ADS]/; my $c = _l 1; $c = _l 1 while $c eq "D"; redo if $cache {"${c}NL-$x $y"} ++ || $cache {"NL$c-$x $y"} ++; push @failures => "${c}NL-$x $y"; push @failures => "NL$c-$x $y"; } @failures; }; __END__ Regexp-Common-2016020301/t/zip/norway.t000755 000765 000024 00000006764 12116413566 020123 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common qw /RE_zip_Norway/; use t::Common qw /run_new_tests cross pdd dd a/; use warnings; my $norway = $RE {zip} {Norway}; my $yes_prefix = $RE {zip} {Norway} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Norway} {-prefix => 'no'}; my $iso_prefix = $RE {zip} {Norway} {-country => 'iso'}; my $cept_prefix = $RE {zip} {Norway} {-country => 'cept'}; my $own_prefix = $RE {zip} {Norway} {-country => 'no'}; use constant PASS => 10; use constant FAIL => 10; my $valid = [ '0000', '9999', map {pdd 4} 1 .. -2 + PASS]; my $short = [ '0', '999', map {dd 1 => 3} 1 .. -2 + FAIL]; my $long = ['00000', '99999', map {dd 5 => 10} 1 .. -2 + FAIL]; my $letter = [map {my $z = dd 4; substr $z, rand (4), 1, a; $z} 1 .. FAIL]; my $wrong = [@$long, @$short, @$letter]; my %targets = ( no_prefix => { list => $valid, wanted => sub {$_, undef, $_ [0]}, }, iso_prefix => { list => $valid, query => sub {"NO-" . $_ [0]}, wanted => sub {$_, "NO", $_ [0]}, }, cept_prefix => { list => $valid, query => sub {"N-" . $_ [0]}, wanted => sub {$_, "N", $_ [0]}, }, own_prefix => { list => $valid, query => sub {"no-" . $_ [0]}, wanted => sub {$_, "no", $_ [0]}, }, wrong1 => { list => $wrong, }, wrong2 => { list => $wrong, query => sub {"NO-" . $_ [0]}, }, wrong3 => { list => $wrong, query => sub {"N-" . $_ [0]}, }, wrong4 => { list => $valid, query => sub {"NO " . $_ [0]}, }, ); my @wrongs = qw /wrong1 wrong2 wrong3 wrong4/; my @tests = ( { name => 'basic', regex => $norway, sub => \&RE_zip_Norway, pass => [qw /no_prefix iso_prefix cept_prefix/], fail => [qw /own_prefix/, @wrongs], }, { name => 'yes_prefix', regex => $yes_prefix, sub => \&RE_zip_Norway, sub_args => [-prefix => 'yes'], pass => [qw /iso_prefix cept_prefix/], fail => [qw /no_prefix own_prefix/, @wrongs], }, { name => 'no_prefix', regex => $no_prefix, sub => \&RE_zip_Norway, sub_args => [-prefix => 'no'], pass => [qw /no_prefix/], fail => [qw /iso_prefix cept_prefix own_prefix/, @wrongs], }, { name => 'iso_prefix', regex => $iso_prefix, sub => \&RE_zip_Norway, sub_args => [-country => 'iso'], pass => [qw /no_prefix iso_prefix/], fail => [qw /cept_prefix own_prefix/, @wrongs], }, { name => 'cept_prefix', regex => $cept_prefix, sub => \&RE_zip_Norway, sub_args => [-country => 'cept'], pass => [qw /no_prefix cept_prefix/], fail => [qw /iso_prefix own_prefix/, @wrongs], }, { name => 'own_prefix', regex => $own_prefix, sub => \&RE_zip_Norway, sub_args => [-country => 'no'], pass => [qw /no_prefix own_prefix/], fail => [qw /iso_prefix cept_prefix/, @wrongs], }, ); run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::zip', ; __END__ Regexp-Common-2016020301/t/zip/spain.t000755 000765 000024 00000012112 12116413566 017676 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use warnings; sub passes; sub failures; use constant PASSES => 20; use constant FAIL => 5; my $normal = $RE {zip} {Spain}; my $prefix = $RE {zip} {Spain} {-prefix => 'yes'}; my $no_prefix = $RE {zip} {Spain} {-prefix => 'no'}; my $iso = $RE {zip} {Spain} {-country => "iso"}; my $cept = $RE {zip} {Spain} {-country => "cept"}; my $country = $RE {zip} {Spain} {-country => "ES"}; my $iso_prefix = $iso -> {-prefix => 'yes'}; my $cept_prefix = $cept -> {-prefix => 'yes'}; my @tests = ( [ normal => $normal => [qw /1 1 1/]], [ prefix => $prefix => [qw /0 1 1/]], ['no prefix' => $no_prefix => [qw /1 0 0/]], [ iso => $iso => [qw /1 0 1/]], [ cept => $cept => [qw /1 1 0/]], [ country => $country => [qw /1 0 1/]], ['iso prefix' => $iso_prefix => [qw /0 0 1/]], ['cept prefix' => $cept_prefix => [qw /0 1 0/]], ); my @failures = failures; my $count; sub mess {print ++ $count, " - $_ (@_)\n"} sub pass {print "ok "; &mess} sub fail {print "not ok "; &mess} my $m = 0; my $k = 0; foreach my $test (@tests) { $m ++ foreach @{$test -> [2]}; $k ++ foreach grep {$_} @{$test -> [2]}; } my $max = 1; $max += PASSES * $m; $max += PASSES * $k; $max += @failures * @tests; print "1..$max\n"; print "ok ", ++ $count, "\n"; sub run_test { my ($name, $re, $should_match) = @_; my $match = "<<$_>>" =~ /$re/; my $good = $match && $_ eq $&; my $line = $good ? "match" : $match ? "wrong match (got: $&)" : "no match"; $line .= "; $name"; if ($should_match) {$good ? pass $line : fail $line} else {$good ? fail $line : pass $line} } sub array_cmp { my ($a1, $a2) = @_; return 0 unless @$a1 eq @$a2; foreach my $i (0 .. $#$a1) { !defined $$a1 [$i] && !defined $$a2 [$i] || defined $$a1 [$i] && defined $$a2 [$i] && $$a1 [$i] eq $$a2 [$i] or return 0; } return 1; } sub __ {map {defined () ? $_ : "UNDEF"} @_} sub run_keep { my ($name, $re, $parts) = @_; my @chunks = /^$re->{-keep}$/; unless (@chunks) {fail "no match; $name - keep"; return} array_cmp (\@chunks, [$_ => @$parts]) ? pass "match; $name - keep" : fail "wrong match [@{[__ @chunks]}]; $name - keep" } sub _ { my $min = $_ [0]; my $max = @_ > 1 ? $_ [1] : $_ [0]; my $x = ""; $x .= int rand 10 for 1 .. $_ [0] + int rand (1 + $max - $min); $x; } my %cache; foreach my $d (1 .. PASSES) { my ($x, $y, $z) = qw /01 0 00/; while ($cache {"$x$y$z"} ++) { $x = _ 2; redo unless 00 < $x && $x <= 52; $y = _ 1; $z = _ 2; } my @t = ([undef, "$x$y$z", $x, $y, $z], ["E", "$x$y$z", $x, $y, $z], ["ES", "$x$y$z", $x, $y, $z]); my $c = 0; foreach my $t (@t) { local $_ = defined $t -> [0] ? $t -> [0] . "-" : ""; $_ .= join "" => @$t [2 .. 4]; foreach my $test (@tests) { my ($name, $re, $matches) = @$test; run_test $name, $re, $matches -> [$c]; run_keep $name, $re -> {-keep}, $t if $matches -> [$c]; } $c ++; } } foreach (@failures) { foreach my $test (@tests) { my ($name, $re) = @$test; /^$re$/ ? fail "match; $name" : pass "no match; $name"; } } sub failures { my @failures = ("", " "); # Too short. push @failures => 0 .. 9; for (1 .. FAIL) { my $x = _ 2, 4; redo if $cache {$x} ++; push @failures => $x; } # Too long. for (1 .. FAIL) { my $x = _ 6, 10; redo if $cache {$x} ++; push @failures => $x; } for my $c ('.', ';', '-', ' ', '+') { for (1 .. FAIL) { my $x = _ 4; $x .= $c; redo if $cache {$x} ++; push @failures => $x; } for (1 .. FAIL) { my $x = _ 4; $x = "$c$x"; redo if $cache {$x} ++; push @failures => $x; } } # Wrong provinces. for (1 .. FAIL) { my $x = _ 2; redo if $x <= 52; my $y = _ 3; redo if $cache {"$x$y"}; push @failures => "$x$y"; } push @failures => "00" . _ 3; # Same failures, with country in front of it as well. push @failures => map {("ES-$_", "E-$_")} @failures; # Wrong countries. for (1 .. FAIL) { my $c = join "" => map {('A' .. 'Z') [rand 26]} 1 .. 2; redo if $c eq "ES" || $c eq "E" || $cache {$c} ++; my $x = _ 5; push @failures => "$c-$x"; } for (1 .. FAIL) { my $c = ('A' .. 'Z') [rand 26]; redo if $cache {$c} ++; my $x = _ 5; push @failures => "${c}ES-$x"; push @failures => "ES$c-$x"; } for (1 .. FAIL) { my $x = _ 5; redo if $cache {"es-$x"} ++; push @failures => "es-$x", "e-$x"; } @failures; } __END__ Regexp-Common-2016020301/t/zip/us.t000755 000765 000024 00000025434 12116413566 017226 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common qw /RE_zip_US/; use t::Common qw /run_new_tests cross gimme sample pdd/; use warnings; my $basic = $RE {zip} {US}; my $ext_yes = $RE {zip} {US} {-extended => 'yes'}; my $ext_no = $RE {zip} {US} {-extended => 'no'}; my $prf_yes = $RE {zip} {US} {-prefix => 'yes'}; my $prf_no = $RE {zip} {US} {-prefix => 'no'}; my $sep_sp = $basic -> {-sep => " "}; my $sep_dsh = $basic -> {-sep => "--"}; my $sep_rg = $basic -> {-sep => "[- ]"}; my $iso = $RE {zip} {US} {-country => 'iso'}; my $cept = $RE {zip} {US} {-country => 'cept'}; my $usa = $RE {zip} {US} {-country => 'usa'}; my $iso_py = $iso -> {-prefix => 'yes'}; my $iso_pn = $iso -> {-prefix => 'no'}; my $cept_py = $cept -> {-prefix => 'yes'}; my $cept_pn = $cept -> {-prefix => 'no'}; my $all = $RE {zip} {US} {-country => 'iso'} {-prefix => 'yes'} {-extended => 'yes'} {-sep => '[- ]'}; my @zips = ("00000", gimme 10, sub {pdd 5}); my @ext = ("0000", gimme 5, sub {pdd 4}); my @zip_ext = (["00000", "0000"], cross (["00000"], [gimme 2 => sub {pdd 4}]), cross ([gimme 2 => sub {pdd 5}], ["0000"]), sample 10 => cross [gimme 5 => sub {pdd 5}], [gimme 5 => sub {pdd 4}]); my @bad_zip = ("0000", "000000", gimme (10 => sub {pdd 2, 4}), # Too short. gimme (10 => sub {pdd 6, 8})); # Too long. my @bad_ext = ("000", "0000", gimme (10 => sub {pdd 1, 3}), # Too short. gimme (10 => sub {pdd 5, 8})); # Too long. my @baddies = @bad_zip; # Basic bad zips. push @baddies => map {join "-" => @_} sample 10 => cross \@zips, \@bad_ext; # Bad extensions. push @baddies => map {join ["\n", qw {_ ! & ---}] -> [rand 5] => @_} sample 10 => cross \@zips, \@ext; # Bad separator. push @baddies => map {"USS-$_"} @zips; # Bad countries. my (@tests, %targets); $targets {simple} = { list => \@zips, wanted => sub {$_, undef, join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), undef, undef, undef, undef,}, }; $targets {simple_USA} = { list => [@zips [0 .. 4]], query => sub {join "-" => "USA", $_ [0]}, wanted => sub {$_, "USA", join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), undef, undef, undef, undef,}, }; $targets {simple_US} = { list => [@zips [0 .. 4]], query => sub {join "-" => "US", $_ [0]}, wanted => sub {$_, "US", join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), undef, undef, undef, undef,}, }; $targets {simple_usa} = { list => [@zips [0 .. 4]], query => sub {join "-" => "usa", $_ [0]}, wanted => sub {$_, "usa", join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), undef, undef, undef, undef,}, }; $targets {extended} = { list => \@zip_ext, query => sub {join "-" => @_}, wanted => sub {$_, undef, join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), "-", $_ [1], substr ($_ [1], 0, 2), substr ($_ [1], 2, 2)}, }; $targets {extended_USA} = { list => \@zip_ext, query => sub {join "-" => "USA", @_}, wanted => sub {$_, "USA", join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), "-", $_ [1], substr ($_ [1], 0, 2), substr ($_ [1], 2, 2)}, }; $targets {extended_US} = { list => \@zip_ext, query => sub {join "-" => "US", @_}, wanted => sub {$_, "US", join ("-" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), "-", $_ [1], substr ($_ [1], 0, 2), substr ($_ [1], 2, 2)}, }; $targets {extended_US_sp} = { list => \@zip_ext, query => sub {"US-" . $_ [0] . " " . $_ [1]}, wanted => sub {$_, "US", join (" " => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), " ", $_ [1], substr ($_ [1], 0, 2), substr ($_ [1], 2, 2)}, }; $targets {sep_sp} = { list => \@zip_ext, query => sub {join " " => @_}, wanted => sub {$_, undef, join (" " => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), " ", $_ [1], substr ($_ [1], 0, 2), substr ($_ [1], 2, 2)}, }; $targets {sep_dashes} = { list => \@zip_ext, query => sub {join "--" => @_}, wanted => sub {$_, undef, join ("--" => @_), $_ [0], substr ($_ [0], 0, 3), substr ($_ [0], 3, 2), "--", $_ [1], substr ($_ [1], 0, 2), substr ($_ [1], 2, 2)}, }; $targets {bad_zip} = { list => \@baddies, }; push @tests => { name => 'basic', regex => $basic, sub => \&RE_zip_US, pass => [qw /simple simple_USA simple_US extended extended_USA extended_US/], fail => [qw /bad_zip sep_sp sep_dashes simple_usa extended_US_sp/], }; push @tests => { name => 'usa', regex => $usa, sub => \&RE_zip_US, sub_args => [-country => 'usa'], pass => [qw /simple simple_usa extended/], fail => [qw /bad_zip sep_sp sep_dashes simple_USA extended_USA simple_US extended_US extended_US_sp/], }; push @tests => { name => 'iso', regex => $iso, sub => \&RE_zip_US, sub_args => [-country => 'iso'], pass => [qw /simple simple_US extended extended_US/], fail => [qw /bad_zip sep_sp sep_dashes simple_USA extended_USA simple_usa extended_US_sp/], }; push @tests => { name => 'iso_py', regex => $iso_py, sub => \&RE_zip_US, sub_args => [-country => 'iso', -prefix => 'yes'], pass => [qw /simple_US extended_US/], fail => [qw /bad_zip sep_sp sep_dashes simple_USA extended_USA extended_US_sp simple extended simple_usa/], }; push @tests => { name => 'iso_pn', regex => $iso_pn, sub => \&RE_zip_US, sub_args => [-country => 'iso', -prefix => 'no'], pass => [qw /simple extended/], fail => [qw /bad_zip sep_sp sep_dashes simple_USA extended_USA extended_US extended_US_sp simple_US simple_usa/], }; push @tests => { name => 'cept_py', regex => $cept_py, sub => \&RE_zip_US, sub_args => [-country => 'cept', -prefix => 'yes'], pass => [qw /simple_USA extended_USA/], fail => [qw /bad_zip sep_sp sep_dashes simple_US simple extended extended_US extended_US_sp simple_usa/], }; push @tests => { name => 'cept_pn', regex => $cept_pn, sub => \&RE_zip_US, sub_args => [-country => 'cept', -prefix => 'no'], pass => [qw /simple extended/], fail => [qw /bad_zip sep_sp sep_dashes simple_USA extended_USA extended_US extended_US_sp simple_US simple_usa/], }; push @tests => { name => 'cept', regex => $cept, sub => \&RE_zip_US, sub_args => [-country => 'cept'], pass => [qw /simple simple_USA extended extended_USA/], fail => [qw /bad_zip sep_sp sep_dashes simple_US simple_usa extended_US extended_US_sp/], }; push @tests => { name => 'ext_yes', regex => $ext_yes, sub => \&RE_zip_US, sub_args => [-extended => 'yes'], pass => [qw /extended extended_USA extended_US/], fail => [qw /simple simple_USA simple_US bad_zip sep_sp sep_dashes simple_usa extended_USA_sp/], }; push @tests => { name => 'ext_no', regex => $ext_no, sub => \&RE_zip_US, sub_args => [-extended => 'no'], pass => [qw /simple simple_USA simple_US/], fail => [qw /extended extended_USA bad_zip sep_sp sep_dashes simple_usa extended_US_sp extended_US/], }; push @tests => { name => 'prf_yes', regex => $prf_yes, sub => \&RE_zip_US, sub_args => [-prefix => 'yes'], pass => [qw /simple_USA simple_US extended_USA extended_US/], fail => [qw /simple extended bad_zip sep_sp sep_dashes simple_usa extended_US_sp/], }; push @tests => { name => 'prf_no', regex => $prf_no, sub => \&RE_zip_US, sub_args => [-prefix => 'no'], pass => [qw /simple extended/], fail => [qw /simple_USA simple_US extended_USA bad_zip sep_sp extended_US extended_US_sp sep_dashes simple_usa/], }; push @tests => { name => 'sep space', regex => $sep_sp, sub => \&RE_zip_US, sub_args => [-sep => ' '], pass => [qw /simple simple_USA simple_US sep_sp extended_US_sp/], fail => [qw /bad_zip sep_dashes extended extended_USA extended_US simple_usa/], }; push @tests => { name => 'sep dashes', regex => $sep_dsh, sub => \&RE_zip_US, sub_args => [-sep => '--'], pass => [qw /simple simple_USA simple_US sep_dashes/], fail => [qw /bad_zip sep_sp extended extended_USA simple_usa extended_US extended_US_sp/], }; push @tests => { name => 'sep regex', regex => $sep_rg, sub => \&RE_zip_US, sub_args => [-sep => '[- ]'], pass => [qw /simple simple_USA simple_US sep_sp extended extended_USA extended_US extended_US_sp/], fail => [qw /bad_zip sep_dashes simple_usa/], }; push @tests => { name => 'all', regex => $all, sub => \&RE_zip_US, sub_args => [-country => 'iso', -prefix => 'yes', -extended => 'yes', -sep => '[- ]'], pass => [qw /extended_US extended_US_sp/], fail => [qw /simple simple_USA simple_US bad_zip sep_sp sep_dashes simple_usa extended extended_USA/], }; run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::zip', version => 5.00503, ; __END__ Regexp-Common-2016020301/t/zip/zip.t000755 000765 000024 00000004332 12116413566 017373 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use warnings; my @tests__ = ("", "\n", "hello, world"); my %tests_t = ( "{1,1}" => [qw [y yes Y YES YeLLow]], "{0,0}" => [qw [n no N NO Nano]], "{0,1}" => [qw [R blue maroon], "\n", "", " ", undef, "\nn"], ); # # Cut and paste from Regexp::Common::zip # my %code = ( Australia => [qw /AUS? AU AUS/], Belgium => [qw /BE? BE B/], Denmark => [qw /DK DK DK/], France => [qw /FR? FR F/], Germany => [qw /DE? DE D/], Greenland => [qw /DK DK DK/], Italy => [qw /IT? IT I/], Netherlands => [qw /NL NL NL/], Norway => [qw /NO? NO N/], Spain => [qw /ES? ES E/], USA => [qw /USA? US USA/], ); my $tests = @tests__ + 2; $tests += @$_ for values %tests_t; $tests += 1; $tests += keys %code; print "1..$tests\n"; my $count = 0; # # Test the __ subroutine. # foreach my $test (@tests__) { my $ret = Regexp::Common::zip::__ $test; printf "%s %d\n" => defined $ret && $ret eq $test ? "ok" : "not ok", ++ $count; } my $ret1 = Regexp::Common::zip::__ undef; my $ret2 = Regexp::Common::zip::__; printf "%s %d\n" => defined $ret1 && $ret1 eq "" ? "ok" : "not ok", ++ $count; printf "%s %d\n" => defined $ret2 && $ret2 eq "" ? "ok" : "not ok", ++ $count; # # Test the _t subroutine # while (my ($ret, $tests) = each %tests_t) { foreach my $test (@$tests) { my $r = Regexp::Common::zip::_t $test; printf "%s %d\n" => defined $r && $r eq $ret ? "ok" : "not ok", ++ $count; } } my $r = Regexp::Common::zip::_t; printf "%s %d\n" => defined $r && $r eq "{0,1}" ? "ok" : "not ok", ++ $count; # # Test the _c subroutine - we don't have to test all the possible # returned values - that's already done from the various country # specific tests. In fact, all we need to test is giving an # undefined second parameter. # while (my ($name, $codes) = each %code) { my $r = Regexp::Common::zip::_c $name; printf "%s %d\n" => defined $r && $r eq $$codes [0] ? "ok" : "not ok", ++ $count; } __END__ Regexp-Common-2016020301/t/URI/any.t000755 000765 000024 00000012167 12116413566 017222 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # TEST URIs try $RE{URI}; for my $scheme (qw [http https]) { pass "$scheme://www.example.com"; pass "$scheme://www.example.com/"; pass "$scheme://www.example.com/some/file/some/where"; pass "$scheme://www.example.com/some/directory/some/where"; pass "$scheme://www.example.com:80/some/file"; pass "$scheme://127.0.0.1"; pass "$scheme://127.0.0.1/"; pass "$scheme://127.0.0.1:12345/some/file"; pass "$scheme://www.example.com:80/some/path?query"; pass "$scheme://www.example.com/%7Eabigail/"; # Test "safe" chars. pass "$scheme://www.example.com/--_\$.+++"; pass "$scheme://www.example.com/."; # Test "extra" chars. pass "$scheme://www.example.com/**!(),,''"; # Test HTTP additional chars. pass "$scheme://www.example.com/:;\@=&=;"; pass "$scheme://www.example.com/some/path?query"; pass "$scheme://www.example.com/some/path?funny**!(),,:;\@=&="; pass "$scheme://www.example.com/some/?"; pass "$scheme://www.example.com/?"; pass "$scheme://www.example.com//////////////"; # Usernames/passwords are NOT allowed in http URIs. fail "$scheme://abigail\@www.example.com"; fail "$scheme://abigail\@www.example.com:80/some/file"; fail "$scheme://abigail:secret\@www.example.com:80/some/file"; fail "$scheme://abigail:secret\@127.0.0.1:80/some/file"; # ~ was NOT allowed by RFC 1738, but currently is. pass "$scheme://www.example.com/~abigail/"; # Fail on "national" characters. fail "$scheme://www.example.com/nope|nope"; fail "$scheme://www.example.com/`"; # Fail on "punctation" characters. fail "$scheme://www.example.com/some/file#target"; # Two question marks used to be failure, but is now allowed. pass "$scheme://www.example.com/some/path?query1?query2"; pass "$scheme://www.example.com/some/??"; # Can have slashes in query. pass "$scheme://www.example.com/some/path?query/path"; } # Scheme must be lower case, and correct. fail 'HTTP://www.example.com/'; fail 'HTTPS://www.example.com/'; pass 'ftp://ftp.example.com'; pass 'ftp://ftp.example.com/'; pass 'ftp://ftp.example.com/some/file/some/where'; pass 'ftp://ftp.example.com/some/directory/some/where/'; pass 'ftp://ftp.example.com:21/some/file'; pass 'ftp://127.0.0.1'; pass 'ftp://127.0.0.1/'; pass 'ftp://127.0.0.1:12345/some/file'; pass 'ftp://ftp.example.com/%7Eabigail/'; fail 'ftp://ftp.example.com:21/some/path?query'; # Test "safe" chars. pass 'ftp://ftp.example.com/--_$.+++'; pass 'ftp://ftp.example.com/.'; # Test "extra" chars. pass "ftp://ftp.example.com/**!(),,''"; # Test URI additional chars. pass 'ftp://www.example.com/:@=&='; pass 'ftp://www.example.com//////////////'; # Should fail on ';'. fail 'ftp://www.example.com/some/path;here'; # Usernames/passwords are allowed in ftp URIs. pass 'ftp://abigail@ftp.example.com'; pass 'ftp://abigail@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@127.0.0.1:21/some/file'; pass 'ftp://abigail:secret:here@127.0.0.1:21/some/file'; # ~ was NOT allowed by RFC 1738, but currently is. pass 'ftp://ftp.example.com/~abigail/'; # Fail on "national" characters. fail 'ftp://ftp.example.com/nope|nope'; fail 'ftp://ftp.example.com/`'; # Fail on "punctation" characters. fail 'ftp://www.example.com/some/file#target'; # Cannot have queries. fail 'ftp://ftp.example.com/some/path?query1?query2'; fail 'ftp://ftp.example.com/some/??'; fail 'ftp://ftp.example.com/some/path?query/path'; # Test type. pass 'ftp://ftp.example.com/some/path;type=A'; pass 'ftp://ftp.example.com/some/path;type=i'; pass 'ftp://abigail@ftp.example.com/some/path/somewhere;type=a', fail 'ftp://ftp.example.com/some/path;type=Q'; fail 'ftp://ftp.example.com/some/path;type=AI'; pass 'ftp://ftp.example.com/;type=I'; # Scheme must be lower case, and correct. fail 'HTTP://ftp.example.com/'; fail 'FTP://ftp.example.com/'; fail 'feeble://ftp.example.com/'; pass 'tel:+12345'; pass 'tel:+358-555-1234567'; pass 'tel:456-7890;phone-context=213'; pass 'tel:456-7890;phone-context=X-COMPANY-NET'; pass 'tel:+1-212-555-1234;tsp=terrifictelecom.com'; pass 'tel:+1-212-555-1234;tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; pass 'tel:+358-555-1234567;postd=pp22'; pass 'tel:0w003585551234567;phone-context=+3585551234'; pass 'tel:+1234567890;phone-context=+1234;vnd.company.option=foo'; pass 'tel:+1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; pass 'tel:+1234;option=%22!%22'; pass 'tel:+1234;option=%22%5C%22%22'; pass 'tel:+1234;option=%22%5C!%22'; pass 'tel:+1234;option=%22bar%22'; pass 'tel:+456-7890;phone-context=213;phone-context=213'; pass 'tel:456-7890;phone-context=213;phone-context=213'; fail 'tel:456-7890'; fail 'tel:+1-800-RUN-PERL'; fail 'tel:+1234;option=%22%22%22'; fail 'tel:+1234;option=%22%5C%22'; pass 'tel:+123-456-789;isub=123(456)'; pass 'tel:+123456;postd=***'; # RT 52309 This 'hangs' pass 'news:comp.infosystems.www.servers.unix'; Regexp-Common-2016020301/t/URI/fax.t000755 000765 000024 00000006645 12116413566 017215 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # TEST URIs try $RE{URI}{fax}; pass 'fax:+12345'; pass 'fax:+358-555-1234567'; pass 'fax:456-7890;phone-context=213'; pass 'fax:456-7890;phone-context=X-COMPANY-NET'; pass 'fax:+1-212-555-1234;tsp=terrifictelecom.com'; pass 'fax:+1-212-555-1234;tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; pass 'fax:+358-555-1234567;postd=pp22'; pass 'fax:0w003585551234567;phone-context=+3585551234'; pass 'fax:+1234567890;phone-context=+1234;vnd.company.option=foo'; pass 'fax:+1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; pass 'fax:+1234;option=%22!%22'; pass 'fax:+1234;option=%22%5C%22%22'; pass 'fax:+1234;option=%22%5C!%22'; pass 'fax:+1234;option=%22bar%22'; pass 'fax:+456-7890;phone-context=213;phone-context=213'; pass 'fax:456-7890;phone-context=213;phone-context=213'; pass 'fax:+12345;tsub=0123456789-.()'; pass 'fax:+358-555-123456;tsub=0123456789-.()7'; pass 'fax:456-7890;tsub=0123456789-.();phone-context=213'; pass 'fax:456-7890;tsub=0123456789-.();phone-context=X-COMPANY-NET'; pass 'fax:+1-212-555-1234;tsub=0123456789-.();tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; fail 'fax:456-7890'; fail 'fax:+1-800-RUN-PERL'; fail 'fax:+1234;option=%22%22%22'; fail 'fax:+1234;option=%22%5C%22'; pass 'fax:+123-456-789;isub=123(456)'; pass 'fax:+123456;postd=***'; fail 'fax:456-7890;phone-context=213;tsub=0123456789-.()'; fail 'fax:456-7890;tsub=213;tsub=456'; fail 'fax:456-7890;tsub=213;'; try $RE{URI}{fax}{nofuture}; pass 'fax:+12345'; pass 'fax:+358-555-1234567'; pass 'fax:456-7890;phone-context=213'; pass 'fax:456-7890;phone-context=X-COMPANY-NET'; pass 'fax:+1-212-555-1234;tsp=terrifictelecom.com'; pass 'fax:+1-212-555-1234;tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; pass 'fax:+358-555-1234567;postd=pp22'; pass 'fax:0w003585551234567;phone-context=+3585551234'; fail 'fax:+1234567890;phone-context=+1234;vnd.company.option=foo'; fail 'fax:+1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; fail 'fax:+1234;option=%22!%22'; fail 'fax:+1234;option=%22%5C%22%22'; fail 'fax:+1234;option=%22%5C!%22'; fail 'fax:+1234;option=%22bar%22'; pass 'fax:+456-7890;phone-context=213;phone-context=213'; pass 'fax:456-7890;phone-context=213;phone-context=213'; fail 'fax:456-7890'; fail 'fax:+1-800-RUN-PERL'; fail 'fax:+1234;option=%22%22%22'; fail 'fax:+1234;option=%22%5C%22'; fail 'fax:+358-555-1234567;phone-context=+1234;postd=pp22'; pass 'fax:+123-456-789;isub=123(456)'; fail 'fax:+123-456-789;isub=123(456);isub=123(456)'; fail 'fax:+123-456-789;isub=A23(456)'; pass 'fax:+123456;postd=***'; fail 'fax:1234567890;phone-context=+1234;vnd.company.option=foo'; fail 'fax:1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; fail 'fax:1234;option=%22!%22'; fail 'fax:1234;option=%22%5C%22%22'; fail 'fax:1234;option=%22%5C!%22'; fail 'fax:1234;option=%22bar%22'; fail 'fax:+12345;tsub=foo'; fail 'fax:456-7890;tsub=213;tsub=456'; fail 'fax:456-7890;tsub=213;'; pass 'fax:456-7890;tsub=0123456789-.();phone-context=213'; pass 'fax:456-7890;tsub=0123456789-.();phone-context=X-COMPANY-NET'; pass 'fax:+1-212-555-1234;tsub=0123456789-.();tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; Regexp-Common-2016020301/t/URI/file.t000755 000765 000024 00000003272 12116413566 017347 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $file = $RE {URI} {file}; my @tests = ( [file => $file => {file => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::file", tests => \@tests, good => $good, bad => $bad, query => \&file, wanted => \&wanted; sub file { my ($tag, $host, $path) = ($_ [0], @{$_ [1]}); my $file = "file://"; $file .= $host if defined $host; $file .= "/$path" if defined $path; $file; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "file"; $wanted [2] = $$parts [0]; $wanted [2] .= "/" . $$parts [1] if defined $$parts [1]; $wanted [3] = $$parts [0]; $wanted [4] = "/" . $$parts [1] if defined $$parts [1]; $wanted [5] = $$parts [1]; \@wanted; } sub create_parts { my (@good, @bad); # Hosts. $good [0] = ["", qw /www.abigail.be www.PERL.com a.b.c.d.e.f.g.h.i.j.k.x 127.0.0.1 w--w--w.abigail.be w3.abigail.be/]; $bad [0] = [qw /www.example..com w+w.example.com w--.example.com 127.0.1 127.0.0.0.1 -w.example.com www.example.1com/]; # Paths. $good [1] = ["", qw {foo foo/bar/baz/bingo foo%00bar foo%EFbar %12%34%E6%7B %12%34/%E6%7B %12%34%E6%7B/foo ()() fnurd&.!@}]; $bad [1] = [undef, qw {foo<> foo<>bar ~abigail %GGfoo foo%F %FOfoo}, '#hubba']; return (\@good, \@bad); } __END__ Regexp-Common-2016020301/t/URI/ftp.t000755 000765 000024 00000014034 12116413566 017217 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # TEST URIs try $RE{URI}{FTP}; pass 'ftp://ftp.example.com'; pass 'ftp://ftp.example.com/'; pass 'ftp://ftp.example.com/some/file/some/where'; pass 'ftp://ftp.example.com/some/directory/some/where/'; pass 'ftp://ftp.example.com:21/some/file'; pass 'ftp://127.0.0.1'; pass 'ftp://127.0.0.1/'; pass 'ftp://127.0.0.1:12345/some/file'; pass 'ftp://ftp.example.com/%7Eabigail/'; fail 'ftp://ftp.example.com:21/some/path?query'; # Test "safe" chars. pass 'ftp://ftp.example.com/--_$.+++'; pass 'ftp://ftp.example.com/.'; # Test "extra" chars. pass "ftp://ftp.example.com/**!(),,''"; # Test URI additional chars. pass 'ftp://www.example.com/:@=&='; pass 'ftp://www.example.com//////////////'; # Should fail on ';'. fail 'ftp://www.example.com/some/path;here'; # Usernames/passwords are allowed in ftp URIs. pass 'ftp://abigail@ftp.example.com'; pass 'ftp://abigail@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@127.0.0.1:21/some/file'; pass 'ftp://abigail:secret:here@127.0.0.1:21/some/file'; # ~ was NOT allowed by RFC 1738, but currently is. pass 'ftp://ftp.example.com/~abigail/'; # Fail on "national" characters. fail 'ftp://ftp.example.com/nope|nope'; fail 'ftp://ftp.example.com/`'; # Fail on "punctation" characters. fail 'ftp://www.example.com/some/file#target'; # Cannot have queries. fail 'ftp://ftp.example.com/some/path?query1?query2'; fail 'ftp://ftp.example.com/some/??'; fail 'ftp://ftp.example.com/some/path?query/path'; # Test type. pass 'ftp://ftp.example.com/some/path;type=A'; pass 'ftp://ftp.example.com/some/path;type=i'; pass 'ftp://abigail@ftp.example.com/some/path/somewhere;type=a', fail 'ftp://ftp.example.com/some/path;type=Q'; fail 'ftp://ftp.example.com/some/path;type=AI'; pass 'ftp://ftp.example.com/;type=I'; # Scheme must be lower case, and correct. fail 'HTTP://ftp.example.com/'; fail 'FTP://ftp.example.com/'; fail 'http://ftp.example.com/'; try $RE{URI}{FTP}{-password}; pass 'ftp://ftp.example.com'; pass 'ftp://ftp.example.com/'; pass 'ftp://ftp.example.com/some/file/some/where'; pass 'ftp://ftp.example.com/some/directory/some/where/'; pass 'ftp://ftp.example.com:21/some/file'; pass 'ftp://127.0.0.1'; pass 'ftp://127.0.0.1/'; pass 'ftp://127.0.0.1:12345/some/file'; pass 'ftp://ftp.example.com/%7Eabigail/'; fail 'ftp://ftp.example.com:21/some/path?query'; # Test "safe" chars. pass 'ftp://ftp.example.com/--_$.+++'; pass 'ftp://ftp.example.com/.'; # Test "extra" chars. pass "ftp://ftp.example.com/**!(),,''"; # Test URI additional chars. pass 'ftp://www.example.com/:@=&='; pass 'ftp://www.example.com//////////////'; # Should fail on ';'. fail 'ftp://www.example.com/some/path;here'; # Usernames/passwords are allowed in ftp URIs. pass 'ftp://abigail@ftp.example.com'; pass 'ftp://abigail@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@127.0.0.1:21/some/file'; fail 'ftp://abigail:secret:here@127.0.0.1:21/some/file'; # ~ was NOT allowed by RFC 1738, but currently is. pass 'ftp://ftp.example.com/~abigail/'; # Fail on "national" characters. fail 'ftp://ftp.example.com/nope|nope'; fail 'ftp://ftp.example.com/`'; # Fail on "punctation" characters. fail 'ftp://www.example.com/some/file#target'; # Cannot have queries. fail 'ftp://ftp.example.com/some/path?query1?query2'; fail 'ftp://ftp.example.com/some/??'; fail 'ftp://ftp.example.com/some/path?query/path'; # Test type. pass 'ftp://ftp.example.com/some/path;type=A'; pass 'ftp://ftp.example.com/some/path;type=i'; pass 'ftp://abigail@ftp.example.com/some/path/somewhere;type=a', fail 'ftp://ftp.example.com/some/path;type=Q'; fail 'ftp://ftp.example.com/some/path;type=AI'; pass 'ftp://ftp.example.com/;type=I'; # Scheme must be lower case, and correct. fail 'HTTP://ftp.example.com/'; fail 'FTP://ftp.example.com/'; fail 'http://ftp.example.com/'; try $RE{URI}{FTP}{"-type" => "[AIDaid]"}; pass 'ftp://ftp.example.com'; pass 'ftp://ftp.example.com/'; pass 'ftp://ftp.example.com/some/file/some/where'; pass 'ftp://ftp.example.com/some/directory/some/where/'; pass 'ftp://ftp.example.com:21/some/file'; pass 'ftp://127.0.0.1'; pass 'ftp://127.0.0.1/'; pass 'ftp://127.0.0.1:12345/some/file'; pass 'ftp://ftp.example.com/%7Eabigail/'; fail 'ftp://ftp.example.com:21/some/path?query'; # Test "safe" chars. pass 'ftp://ftp.example.com/--_$.+++'; pass 'ftp://ftp.example.com/.'; # Test "extra" chars. pass "ftp://ftp.example.com/**!(),,''"; # Test URI additional chars. pass 'ftp://www.example.com/:@=&='; pass 'ftp://www.example.com//////////////'; # Should fail on ';'. fail 'ftp://www.example.com/some/path;here'; # Usernames/passwords are allowed in ftp URIs. pass 'ftp://abigail@ftp.example.com'; pass 'ftp://abigail@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@ftp.example.com:21/some/file'; pass 'ftp://abigail:secret@127.0.0.1:21/some/file'; pass 'ftp://abigail:secret:here@127.0.0.1:21/some/file'; # ~ was NOT allowed by RFC 1738, but currently is. pass 'ftp://ftp.example.com/~abigail/'; # Fail on "national" characters. fail 'ftp://ftp.example.com/nope|nope'; fail 'ftp://ftp.example.com/`'; # Fail on "punctation" characters. fail 'ftp://www.example.com/some/file#target'; # Cannot have queries. fail 'ftp://ftp.example.com/some/path?query1?query2'; fail 'ftp://ftp.example.com/some/??'; fail 'ftp://ftp.example.com/some/path?query/path'; # Test type. pass 'ftp://ftp.example.com/some/path;type=A'; pass 'ftp://ftp.example.com/some/path;type=i'; pass 'ftp://abigail@ftp.example.com/some/path/somewhere;type=a', pass 'ftp://ftp.example.com/some/path;type=D'; fail 'ftp://ftp.example.com/some/path;type=AI'; pass 'ftp://ftp.example.com/;type=I'; # Scheme must be lower case, and correct. fail 'HTTP://ftp.example.com/'; fail 'FTP://ftp.example.com/'; fail 'http://ftp.example.com/'; Regexp-Common-2016020301/t/URI/gopher.t000755 000765 000024 00000007014 12116413566 017712 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $gopher = $RE {URI} {gopher}; my $gopher_notab = $RE {URI} {gopher} {-notab}; # No point in crosschecking, URI creation is tag independent. my @tests = ( [gopher => $gopher => {gopher => NORMAL_PASS | FAIL}], [gopher_notab => $gopher_notab => {gopher_notab => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::gopher", tests => \@tests, good => $good, bad => $bad, query => \&gopher, wanted => \&wanted, filter => \&filter, ; sub gopher { my ($tag, $host, $port, $gtype, $selector, $search, $gopherplus_string) = ($_ [0], @{$_ [1]}); my $gopher = "gopher://"; $gopher .= $host if defined $host; $gopher .= ":$port" if defined $port; $gopher .= "/$gtype" if defined $gtype; $gopher .= $selector if defined $selector; $gopher .= "%09$search" if defined $search; $gopher .= "%09$gopherplus_string" if defined $gopherplus_string; $gopher; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "gopher"; $wanted [2] = $$parts [0]; # host. $wanted [3] = $$parts [1]; # port. $wanted [4] = join "" => grep {defined} @$parts [2, 3]; $wanted [4] .= "%09" . $$parts [4] if defined $$parts [4]; $wanted [4] .= "%09" . $$parts [5] if defined $$parts [5]; $wanted [5] = $$parts [2]; # gtype. if ($tag eq 'gopher_notab') { $wanted [6] = $$parts [3]; # selector. $wanted [7] = $$parts [4]; # search. $wanted [8] = $$parts [5]; # gopherplus_string. } else { $wanted [6] = join "%09" => grep {defined} @$parts [3, 4, 5]; } \@wanted; } sub create_parts { my (@good, @bad); local $^W = 0; # Hosts. # Host/ports are tested with other URIs as well, we're not using # all the combinations here. $good [0] = [qw /www.abigail.be 127.0.0.1 w--w--w3.ABIGAIL.nl/]; $bad [0] = [qw /www.example..com w+w.example.com 127.0.0.0.1/]; # Ports. $good [1] = [undef, 70]; $bad [1] = ["", qw /: port/]; # Gtype # No need for both "" and 'undef' in the bad part here - they will # result in the same URI. $good [2] = [qw /0 + T/]; $bad [2] = ["", qw /~/]; # Selector # Don't use an 'undef' here. It will create the same URI as for # the empty string, but {-keep} will return "". $good [3] = ["", qw {FNURD 0}, q {$_.+!*'(),:@&=%FF}]; pop @{$good [3]} if $] < 5.006; # For speed. $bad [3] = [qw {/ []}]; # Search $good [4] = [undef, "", qw {FNORD 0}, q {$_.+!*'(),:@&=%FF}]; $bad [4] = [qw {/ []}]; # Gopherplus string $good [5] = [undef, "", qw {fnord 0}, q {$_.+%09!*'(),:@&=%FF}]; $bad [5] = [qw {/ []}]; return (\@good, \@bad); } sub filter { # Disallow defined gopherplus strings if search is undefined. return 0 if defined $_ [0] -> [5] && !defined $_ [0] -> [4]; # If the gtype is "", but the selector starts with a char that's # a valid gtype, we'll see a match where we'd expect a failure. return 0 if $_ [0] -> [2] eq "" && defined $_ [0] -> [3] && $_ [0] -> [3] =~ /^[0-9+IgT]/; return 1; } __END__ Regexp-Common-2016020301/t/URI/http.t000755 000765 000024 00000005027 12116413566 017407 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $http = $RE {URI} {HTTP}; my $https = $http -> {-scheme => 'https'}; my $any = $http -> {-scheme => 'https?'}; my @tests = ( [http => $http => {http => NORMAL_PASS | FAIL, https => NORMAL_FAIL}], [https => $https => {http => NORMAL_FAIL, https => NORMAL_PASS | FAIL}], [any => $any => {http => NORMAL_PASS, https => NORMAL_PASS}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI", tests => \@tests, good => $good, bad => $bad, query => \&uri, wanted => \&wanted, filter => \&filter; sub uri { my ($scheme, $host, $port, $path, $query) = ($_ [0], @{$_ [1]}); my $uri = "$scheme://$host"; $uri .= ":$port" if defined $port; $uri .= "/$path" if defined $path; $uri .= "?$query" if defined $query && defined $path; $uri; } sub wanted { my ($scheme, $parts) = @_; my $abs = $parts -> [2]; $abs .= "?$parts->[3]" if defined $abs && defined $parts -> [3]; my @wanted; $wanted [0] = $_; $wanted [1] = $scheme; $wanted [2] = $parts -> [0]; $wanted [3] = $parts -> [1]; $wanted [4] = "/$abs" if defined $abs; $wanted [5] = $abs if defined $abs; $wanted [6] = $parts -> [2]; $wanted [7] = undef; $wanted [7] = $parts -> [3] if defined $parts -> [2]; \@wanted; } sub create_parts { my (@good, @bad); # Hosts. $good [0] = [qw /www.abigail.be www.PERL.com a.b.c.d.e.f.g.h.i.j.k.x 127.0.0.1 w--w--w.abigail.be w3.abigail.be/]; $bad [0] = ["", qw /www.example..com w+w.example.com w--.example.com 127.0.1 127.0.0.0.1 -w.example.com www.example.1com/]; # Ports. $good [1] = [undef, "", 80]; $bad [1] = [qw /-19 : port/]; # Paths. $good [2] = [undef, "", qw {foo foo/bar/baz/bingo foo%00bar foo%EFbar %12%34%E6%7B %12%34/%E6%7B %12%34%E6%7B/foo ()() fnurd&.!~@}]; $bad [2] = [qw {foo<> foo<>bar %GGfoo foo%F %FOfoo}, '#hubba']; # Queries. $good [3] = [undef, "", qw {hubba fnurd=many&woozle=yes %3E%FF barra?femmy??dopey}]; $bad [3] = ['query#', '#query', 'qu#ry']; return (\@good, \@bad); } sub filter { return !defined $_ [0] -> [2] && defined $_ [0] -> [3] ? 0 : 1 } __END__ Regexp-Common-2016020301/t/URI/news.t000755 000765 000024 00000003755 12116413566 017412 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $news = $RE {URI} {news}; my @tests = ( [news => $news => {news => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::news", tests => \@tests, good => $good, bad => $bad, query => \&news, wanted => \&wanted; sub news { my ($tag, $grouppart) = ($_ [0], @{$_ [1]}); my $news = "news:"; $news .= $grouppart if defined $grouppart; $news; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "news"; $wanted [2] = $$parts [0]; \@wanted; } sub create_parts { my (@good, @bad); my @good_arts = qw {fnord banzai123 4567 000 (!!make-$$$-fast**) %00%FF%12''' really? ?/?/?/&=:;}; my @good_hosts = qw /www.abigail.be www.PERL.com a.b.c.d.e.f.g.h.i.j.k.x 127.0.0.1 w--w--w.abigail.be w3.abigail.be/; my @bad_arts = ("", qw /%GG %F %7- %% {} <> ~abigail []/); my @bad_hosts = ("", qw /www.example..com w+w.example.com w--.example.com 127.0.1 127.0.0.0.1 -w.example.com www.example.1com/); # Groupparts. $good [0] = ["*", qw /a comp.lang.perl.misc comp.lang.c++ hello_kitty_ foo-1234567890/, map {join '@' => @$_} t::Common::cross (\@good_arts, \@good_hosts)]; $bad [0] = ["", qw /1234567890 ** really? (!!make-$$$-fast**) _hello_kitty_/, (map {join '@' => @$_} t::Common::cross (\@good_arts, \@bad_hosts)), (map {join '@' => @$_} t::Common::cross (\@bad_arts, \@good_hosts))]; return (\@good, \@bad); } __END__ Regexp-Common-2016020301/t/URI/nntp.t000755 000765 000024 00000004130 12116413566 017401 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $nntp = $RE {URI} {NNTP}; my @tests = ( [nntp => $nntp => {nntp => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::news", tests => \@tests, good => $good, bad => $bad, query => \&nntp, wanted => \&wanted; sub nntp { my ($tag, $host, $port, $group, $digits) = ($_ [0], @{$_ [1]}); my $nntp = "nntp://"; $nntp .= $host if defined $host; $nntp .= ":$port" if defined $port; $nntp .= "/$group" if defined $group; $nntp .= "/$digits" if defined $digits; $nntp; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "nntp"; $wanted [2] = join "/" => grep {defined} join (":" => grep {defined} @$parts [0, 1]), @$parts [2, 3]; $wanted [3] = join ":" => grep {defined} @$parts [0, 1]; $wanted [4] = $$parts [0]; $wanted [5] = $$parts [1]; $wanted [6] = $$parts [2]; $wanted [7] = $$parts [3]; \@wanted; } sub create_parts { my (@good, @bad); # Hosts. $good [0] = [qw /www.abigail.be www.PERL.com a.b.c.d.e.f.g.h.i.j.k.x 127.0.0.1 w--w--w.abigail.be w3.abigail.be/]; $bad [0] = [qw /www.example..com w+w.example.com w--.example.com 127.0.0.0.1 -w.example.com www.example.1com/]; # Ports. $good [1] = [undef, 119]; $bad [2] = ["", qw /-19 : port/]; # Group. $good [2] = [qw /a comp.lang.perl.misc comp.lang.c++ hello_kitty_ foo-1234567890/]; $bad [2] = [undef, "", qw /1234567890 ** really? (!!make-$$$-fast**) _hello_kitty_/]; # Digits. $good [3] = [undef, qw /0 0000 12345/]; $bad [3] = ["", qw /fnurd -19 */, "1234/", "/12", "/"]; return (\@good, \@bad); } __END__ Regexp-Common-2016020301/t/URI/pop.t000755 000765 000024 00000004256 12116413566 017231 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; $DEBUG = 1; sub create_parts; my $scheme = 'pop'; my $pop = $RE {URI} {uc $scheme}; # No point in crosschecking, URI creation is tag independent. my @tests = ( [pop => $pop => {pop => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::$scheme", tests => \@tests, good => $good, bad => $bad, query => \&query, wanted => \&wanted, filter => \&filter, ; sub query { my ($tag, $user, $auth_type, $host, $port) = ($_ [0], @{$_ [1]}); my $url = "$scheme://"; if (defined $user) { $url .= $user; $url .= ";AUTH=$auth_type" if defined $auth_type; $url .= '@'; } $url .= $host if defined $host; $url .= ":$port" if defined $port; $url; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "$scheme"; $wanted [2] = $$parts [0]; # user. $wanted [3] = $$parts [1]; # auth. $wanted [4] = $$parts [2]; # host. $wanted [5] = $$parts [3]; # port. \@wanted; } sub create_parts { my (@good, @bad); # Users $good [0] = [undef, qw /abigail/]; push @{$good [0]} => qw /abigail%20&%20a%20camel=/ unless $] < 5.006; $bad [0] = ["", qw /abigail%GG [abigail]/]; # Auth_type $good [1] = [undef, qw /* password &~=~& +APOP +password +/]; $bad [1] = ["", qw /"password" camel-][/]; # Hosts. $good [2] = [qw /pop3.abigail.be pop3.PERL.com 127.0.0.1/]; push @{$good [2]} => qw /a.b.c.d.e.f.g.h.i.j.k.x p--p--p.abigail.be/ unless $] < 5.006; # Speed. $bad [2] = [qw /www.example..com w+w.example.com 127.0.0.0.1 w--.example.com -w.example.com www.example.1com/]; # Ports. $good [3] = [undef, 110]; $bad [3] = ["", qw /: port/]; return (\@good, \@bad); } sub filter { return 0 if defined ${$_ [0]} [1] && !defined ${$_ [0]} [0]; return 1; } __END__ Regexp-Common-2016020301/t/URI/prospero.t000755 000765 000024 00000005567 12116413566 020312 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; $DEBUG = 1; sub create_parts; my $prospero = $RE {URI} {prospero}; # No point in crosschecking, URI creation is tag independent. my @tests = ( [prospero => $prospero => {prospero => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::prospero", tests => \@tests, good => $good, bad => $bad, query => \&prospero, wanted => \&wanted, filter => \&filter, ; sub prospero { my ($tag, $host, $port, $ppath, $fieldnames, $fieldvalues) = ($_ [0], @{$_ [1]}); my $prospero = "prospero://"; $prospero .= $host if defined $host; $prospero .= ":$port" if defined $port; $prospero .= "/$ppath" if defined $ppath; if (defined $fieldnames) { foreach my $i (0 .. $#$fieldnames) { $prospero .= ";$fieldnames->[$i]"; $prospero .= "=$fieldvalues->[$i]" if defined $fieldvalues -> [$i]; } } $prospero; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "prospero"; $wanted [2] = $$parts [0]; # host. $wanted [3] = $$parts [1]; # port. $wanted [4] = $$parts [2]; # ppart. $wanted [5] = ""; if (defined $$parts [3]) { foreach my $i (0 .. $#{$$parts [3]}) { $wanted [5] .= ";${$$parts [3]}[$i]=${$$parts [4]}[$i]"; } } \@wanted; } sub create_parts { my (@good, @bad); # Hosts. $good [0] = [qw /www.abigail.be www.PERL.com a.b.c.d.e.f.g.h.i.j.k.x 127.0.0.1 w--w--w.abigail.be w3.abigail.be/]; $bad [0] = [qw /www.example..com w+w.example.com w--.example.com 127.0.0.0.1 -w.example.com www.example.1com/]; # Ports. $good [1] = [undef, 1525]; $bad [1] = ["", qw /: port/]; # Ppart $good [2] = ["", qw {part foo/bar fnord:&=?%FF}]; $bad [2] = [undef, qw {~}, ' ']; # Fieldname $good [3] = [undef, [qw /name/], [qw /name1 name2/], [""], ["", ""], ["", qw /name/], [qw /fnord:&?%FF/]]; $bad [3] = [[qw /name==/], ['~']]; # Fieldvalue $good [4] = [undef, [qw /value/], [qw /value1 value2/], [""], ["", ""], ["", qw /value/], [qw /fnord:&?%FF/]]; $bad [4] = [[qw /value==/], ['~'], [undef], [undef, undef]]; return (\@good, \@bad); } sub filter { return 1 if !defined ${$_ [0]} [3] && !defined ${$_ [0]} [4]; return 0 if defined ${$_ [0]} [3] && !defined ${$_ [0]} [4] || !defined ${$_ [0]} [3] && defined ${$_ [0]} [4]; return 0 if @{${$_ [0]} [3]} != @{${$_ [0]} [4]}; return 1; } __END__ Regexp-Common-2016020301/t/URI/tel.t000755 000765 000024 00000005242 12116413566 017213 0ustar00abigailstaff000000 000000 # VOODOO LINE-NOISE my($C,$M,$P,$N,$S);END{print"1..$C\n$M";print"\nfailed: $N\n"if$N} sub ok{$C++; $M.= ($_[0]||!@_)?"ok $C\n":($N++,"not ok $C (". ((caller 1)[1]||(caller 0)[1]).":".((caller 1)[2]||(caller 0)[2]).")\n")} sub try{$P=qr/^$_[0]$/}sub fail{ok($S=$_[0]!~$P)}sub pass{ok($S=$_[0]=~$P)} # LOAD use Regexp::Common; ok; # TEST URIs try $RE{URI}{tel}; pass 'tel:+12345'; pass 'tel:+358-555-1234567'; pass 'tel:456-7890;phone-context=213'; pass 'tel:456-7890;phone-context=X-COMPANY-NET'; pass 'tel:+1-212-555-1234;tsp=terrifictelecom.com'; pass 'tel:+1-212-555-1234;tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; pass 'tel:+358-555-1234567;postd=pp22'; pass 'tel:0w003585551234567;phone-context=+3585551234'; pass 'tel:+1234567890;phone-context=+1234;vnd.company.option=foo'; pass 'tel:+1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; pass 'tel:+1234;option=%22!%22'; pass 'tel:+1234;option=%22%5C%22%22'; pass 'tel:+1234;option=%22%5C!%22'; pass 'tel:+1234;option=%22bar%22'; pass 'tel:+456-7890;phone-context=213;phone-context=213'; pass 'tel:456-7890;phone-context=213;phone-context=213'; fail 'tel:456-7890'; fail 'tel:+1-800-RUN-PERL'; fail 'tel:+1234;option=%22%22%22'; fail 'tel:+1234;option=%22%5C%22'; pass 'tel:+123-456-789;isub=123(456)'; pass 'tel:+123456;postd=***'; try $RE{URI}{tel}{nofuture}; pass 'tel:+12345'; pass 'tel:+358-555-1234567'; pass 'tel:456-7890;phone-context=213'; pass 'tel:456-7890;phone-context=X-COMPANY-NET'; pass 'tel:+1-212-555-1234;tsp=terrifictelecom.com'; pass 'tel:+1-212-555-1234;tsp=terrifictelecom.com;phone-context=X-COMPANY-NET'; pass 'tel:+358-555-1234567;postd=pp22'; pass 'tel:0w003585551234567;phone-context=+3585551234'; fail 'tel:+1234567890;phone-context=+1234;vnd.company.option=foo'; fail 'tel:+1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; fail 'tel:+1234;option=%22!%22'; fail 'tel:+1234;option=%22%5C%22%22'; fail 'tel:+1234;option=%22%5C!%22'; fail 'tel:+1234;option=%22bar%22'; pass 'tel:+456-7890;phone-context=213;phone-context=213'; pass 'tel:456-7890;phone-context=213;phone-context=213'; fail 'tel:456-7890'; fail 'tel:+1-800-RUN-PERL'; fail 'tel:+1234;option=%22%22%22'; fail 'tel:+1234;option=%22%5C%22'; fail 'tel:+358-555-1234567;phone-context=+1234;postd=pp22'; pass 'tel:+123-456-789;isub=123(456)'; fail 'tel:+123-456-789;isub=123(456);isub=123(456)'; fail 'tel:+123-456-789;isub=A23(456)'; pass 'tel:+123456;postd=***'; fail 'tel:1234567890;phone-context=+1234;vnd.company.option=foo'; fail 'tel:1234567890;phone-context=+1234;vnd.company.option=%22foo%22'; fail 'tel:1234;option=%22!%22'; fail 'tel:1234;option=%22%5C%22%22'; fail 'tel:1234;option=%22%5C!%22'; fail 'tel:1234;option=%22bar%22'; Regexp-Common-2016020301/t/URI/telnet.t000755 000765 000024 00000004651 12116413566 017725 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $telnet = $RE {URI} {telnet}; my @tests = ( [telnet => $telnet => {telnet => NORMAL_PASS | FAIL}] ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI", tests => \@tests, good => $good, bad => $bad, query => \&uri, wanted => \&wanted, filter => \&filter; sub uri { my ($scheme, $user, $password, $host, $port, $slash) = ($_ [0], @{$_ [1]}); my $uri = "$scheme://"; $uri .= $user if defined $user; $uri .= ":$password" if defined $user && defined $password; $uri .= '@' if defined $user; $uri .= $host; $uri .= ":$port" if defined $port; $uri .= $slash if defined $slash; $uri; } sub wanted { my ($scheme, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = $scheme; if (defined $$parts [0]) { $wanted [2] = $$parts [0]; $wanted [3] = $$parts [0]; if (defined $$parts [1]) { $wanted [2] .= ":$$parts[1]"; $wanted [4] = $$parts [1]; } } $wanted [5] = $$parts [2]; $wanted [6] = $$parts [2]; if (defined $$parts [3]) { $wanted [5] .= ":$$parts[3]"; $wanted [7] = $$parts [3]; } $wanted [8] = undef; $wanted [8] = "/" if $$parts [4]; \@wanted; } sub create_parts { my (@good, @bad); # Users. $good [0] = [undef, "", qw /abigail ab?ga?l; abi%67ai%6C/]; $bad [0] = [qw /abigail-][/]; # Passwords. $good [1] = [undef, "", qw /secret se??et se%FFret/]; $bad [1] = [qw /se{}cret/]; # Hosts. $good [2] = [qw /www.abigail.be www.PERL.com 127.0.0.1 w3.abigail.be/]; push @{$good [2]} => qw /a.b.c.d.e.f.g.h.i.j.k.x w--w--w.abigail.be/ unless $] < 5.006; $bad [2] = [qw /www.example..com w+w.example.com w--.example.com 127.0.0.0.1 -w.example.com www.example.1com/]; # Ports. $good [3] = [undef, 123]; $bad [3] = ["", qw /-19 : port/]; # Trailing /. $good [4] = [undef, '/']; $bad [4] = ['//', '/foo', '@']; (\@good, \@bad); } sub filter { return !defined $_ [0] -> [0] && defined $_ [0] -> [1] ? 0 : 1 } __END__ Regexp-Common-2016020301/t/URI/tv.t000755 000765 000024 00000002245 12116413566 017060 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; $^W = 1; my $count; my $tv = qr /^$RE{URI}{tv}$/; my $keep = qr /^$RE{URI}{tv}{-keep}$/; sub mess {print ++ $count, " - $_ (@_)\n"} sub pass {print "ok "; &mess} sub fail {print "not ok "; &mess} my (@hosts, @failures); while () { chomp; last unless /\S/; push @hosts => $_; } push @hosts => ""; while () { chomp; last unless /\S/; push @failures => $_; } my $max = 1 + 2 * @hosts + @failures; print "1..$max\n"; print "ok ", ++ $count, "\n"; # print "$fail\n"; exit; foreach my $host (@hosts) { local $_ = "tv:$host"; /$tv/ ? pass "match" : fail "no match"; /$keep/ ? $1 eq $_ && $2 eq "tv" && (length $host ? $3 eq $host : !defined $3) ? pass "match; keep" : fail "match ($1, $2, $3); keep" : fail "no match; keep" } foreach (@failures) { /$tv/ ? fail "match" : pass "no match"; } __DATA__ wqed.com nbc.com abc.com abc.co.au east.hbo.com west.hbo.com bbc.co.uk TV:abc.com abc.com http:abc.com tv://abc.com tv:abc..com tv:.abc.com tv:abc-.com tv:-abc.com Regexp-Common-2016020301/t/URI/wais.t000755 000765 000024 00000006377 12116413566 017404 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; $^W = 1; sub create_parts; my $wais = $RE {URI} {WAIS}; # No point in crosschecking, URI creation is tag independent. my @tests = ( [wais => $wais => {wais => NORMAL_PASS | FAIL}], ); my ($good, $bad) = create_parts; run_tests version => "Regexp::Common::URI::wais", tests => \@tests, good => $good, bad => $bad, query => \&wais, wanted => \&wanted, filter => \&filter, filter_passes => \&filter_passes, ; sub wais { my ($tag, $host, $port, $database, $search, $wtype, $wpath) = ($_ [0], @{$_ [1]}); my $wais = "wais://"; $wais .= $host if defined $host; $wais .= ":$port" if defined $port; $wais .= "/$database" if defined $database; $wais .= "?$search" if defined $search; $wais .= "/$wtype" if defined $wtype; $wais .= "/$wpath" if defined $wpath; $wais; } sub wanted { my ($tag, $parts) = @_; my @wanted; $wanted [0] = $_; $wanted [1] = "wais"; $wanted [2] = $$parts [0]; # host. $wanted [3] = $$parts [1]; # port. $wanted [4] = $$parts [2]; # database. $wanted [4] .= "?" . $$parts [3] if defined $$parts [3]; $wanted [4] .= "/" . $$parts [4] if defined $$parts [4]; $wanted [4] .= "/" . $$parts [5] if defined $$parts [5]; $wanted [5] = $$parts [2]; # database. $wanted [6] = undef; $wanted [6] .= "?" . $$parts [3] if defined $$parts [3]; $wanted [6] .= "/" . $$parts [4] if defined $$parts [4]; $wanted [6] .= "/" . $$parts [5] if defined $$parts [5]; $wanted [7] = $$parts [3]; # search. $wanted [8] = $$parts [4]; # wtype. $wanted [9] = $$parts [5]; # wpath. \@wanted; } sub create_parts { my (@good, @bad); # Hosts. # Host/ports are tested with other URIs as well, we're not using # all the combinations here. $good [0] = [qw /www.abigail.be 127.0.0.1 w--w--w3.ABIGAIL.nl/]; $bad [0] = [qw /www.example..com w+w.example.com 127.0.0.0.1/]; # Ports. $good [1] = [undef, 210]; $bad [1] = ["", qw /: port/]; # Database $good [2] = ["", qw /database 0/, '%00%FF-!*,']; $bad [2] = [undef, qw /~/]; # Search $good [3] = [undef, "", qw /database 0/, '%00%FF-!*,']; $bad [3] = [qw {~ []}]; # Wtype $good [4] = [undef, "", qw /wtype 0/, '%00%FF-!*,']; $bad [4] = [qw {~ []}]; # Wpath $good [5] = [undef, "", qw /wpath 0/, '%00%FF-!*,']; $bad [5] = [qw {~ []}]; return (\@good, \@bad); } sub filter_passes { # Good URIs have either both a wtype and a wpath, or none at all. return 0 if defined $_ [0] -> [4] xor defined $_ [0] -> [5]; return 1; } sub filter { # At most one of 'search' and 'wtype/wpath' should be defined. return 0 if defined $_ [0] -> [3] && (defined $_ [0] -> [4] || defined $_ [0] -> [5]); return 0 if !defined $_ [0] -> [2] && grep {defined} @{$_ [0]} [3 .. 5]; return 1; } __END__ Regexp-Common-2016020301/t/SEN/usa_ssn.t000755 000765 000024 00000004124 12116413566 020066 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common qw /run_new_tests cross criss_cross dd pdd/; $^W = 1; my $ssn = $RE {SEN} {USA} {SSN}; my $space = $ssn -> {-sep => ' '}; my $empty = $ssn -> {-sep => ''}; use constant PASS => 4; use constant FAIL => 3; my $areas = [ "001", map {pdd 3} 1 .. PASS]; my $groups = [ "01", map {pdd 2} 1 .. PASS]; my $serials = ["0001", map {pdd 4} 1 .. PASS]; my $bad_a = [ "000", "", dd (1), dd (2), dd (4), dd (5, 10)]; my $bad_g = [ "00", "", dd (1), dd (3), dd (4), dd (5, 10)]; my $bad_s = ["0000", "", dd (1), dd (2), dd (3), dd (5, 10)]; my $ssns = [cross $areas, $groups, $serials]; my $wrong = [criss_cross [[@$areas [0 .. FAIL - 1]], [@$groups [0 .. FAIL - 1]], [@$serials [0 .. FAIL - 1]]], [$bad_a, $bad_g, $bad_s]]; my %targets = ( ssn => { list => $ssns, query => sub {join "-" => @_}, wanted => sub {$_ => @_}, }, space => { list => $ssns, query => sub {join " " => @_}, wanted => sub {$_ => @_}, }, empty => { list => $ssns, query => sub {join "" => @_}, wanted => sub {$_ => @_}, }, wrong1 => { list => $wrong, query => sub {join "-" => @_}, }, wrong2 => { list => $wrong, query => sub {join " " => @_}, }, ); my @wrongs = qw /wrong1 wrong2/; my @tests = ( { name => 'basic', regex => $ssn, pass => [qw /ssn/], fail => [qw /empty space/, @wrongs], }, { name => 'space', regex => $space, pass => [qw /space/], fail => [qw /empty ssn/, @wrongs], }, { name => 'empty', regex => $empty, pass => [qw /empty/], fail => [qw /ssn space/, @wrongs], }, ); run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::SEN', ; __END__ Regexp-Common-2016020301/t/number/101_integer.t000755 000765 000024 00000014256 12643716764 021316 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } # # Some basic patterns: plain integers, signed integers, unsigned integers. # my $integer = make_test "Integer pattern" => $RE {num} {int}; my $signed_integer = make_test "Integer pattern" => $RE {num} {int}, -sign => '[-+]'; my $unsigned_integer = make_test "Integer pattern" => $RE {num} {int}, -sign => ''; foreach my $digit (0 .. 9) { $integer -> match ( $digit, [ $digit, "", $digit], test => "Single digit"); $integer -> match ("-$digit", ["-$digit", "-", $digit], test => "Single digit, with minus sign"); $integer -> match ("+$digit", ["+$digit", "+", $digit], test => "Single digit, with plus sign"); $signed_integer -> no_match ($digit, reason => "No sign"); $signed_integer -> match ("-$digit", ["-$digit", "-", $digit], test => "Single digit, with minus sign"); $signed_integer -> match ("+$digit", ["+$digit", "+", $digit], test => "Single digit, with plus sign"); $unsigned_integer -> match ( $digit, [ $digit, "", $digit], test => "Single digit"); $unsigned_integer -> no_match ("-$digit", reason => "Number has - sign"); $unsigned_integer -> no_match ("+$digit", reason => "Number has + sign"); } foreach my $letter ('A' .. 'Z') { my $reason = "Cannot match letters"; $integer -> no_match ( $letter, reason => $reason); $integer -> no_match ("-$letter", reason => $reason); $integer -> no_match ("+$letter", reason => $reason); $signed_integer -> no_match ( $letter, reason => $reason); $signed_integer -> no_match ("-$letter", reason => $reason); $signed_integer -> no_match ("+$letter", reason => $reason); $unsigned_integer -> no_match ( $letter, reason => $reason); $unsigned_integer -> no_match ("-$letter", reason => $reason); $unsigned_integer -> no_match ("+$letter", reason => $reason); } my @numbers = qw [ 123456789 987654321 00 00000 918710985710984523480938457287510917634178356017501984571273461782346 2109381270129857102931405984051817410923193913810985 123981098509850493582357010910371947524594785923602871749187249504395 000000000000000000000000000000000000000000000000000000000000000000000001 12890991823457 09857109247120 0000000090000000000000009000000000000000 ]; my @big_numbers = ( '123456789' x 100, '0' x 10_000, ); my @failures = ( [" 12345" => "Leading space"], ["123 " => "Trailing space"], ["-+1234" => "Double sign"], ["--54311" => "Double sign"], ["- 897" => "Space after sign"], ["" => "Empty string"], ["-" => "Sign only"], ["1234 678" => "Space in number"], ["1234+678" => "Sign in number"], ["678A90" => "Letter in number"], ["0x1234" => "Hex number"], ["0b1234" => "Octal number"], ["Bla bla" => "Garbage"], ); foreach my $number (@numbers) { $integer -> match ( $number , [ $number , "", $number], test => "Unsigned number"); $integer -> match ("-$number", ["-$number", "-", $number], test => "Number with minus sign"); $integer -> match ("+$number", ["+$number", "+", $number], test => "Number with plus sign"); $signed_integer -> no_match ($number, reason => "Number is unsigned"); $signed_integer -> match ("-$number", ["-$number", "-", $number], test => "Number with minus sign"); $signed_integer -> match ("+$number", ["+$number", "+", $number], test => "Number with plus sign"); $unsigned_integer -> match ( $number , [ $number , "", $number], test => "Unsigned number"); $unsigned_integer -> no_match ("-$number", reason => "Number has - sign"); $unsigned_integer -> no_match ("+$number", reason => "Number has + sign"); } foreach my $number (@big_numbers) { $integer -> match ( $number , [ $number , "", $number], test => "Unsigned big number"); $integer -> match ("-$number", ["-$number", "-", $number], test => "Big number with minus sign"); $integer -> match ("+$number", ["+$number", "+", $number], test => "Big number with plus sign"); $signed_integer -> no_match ($number, reason => "Number is unsigned"); $signed_integer -> match ("-$number", ["-$number", "-", $number], test => "Big number with minus sign"); $signed_integer -> match ("+$number", ["+$number", "+", $number], test => "Big number with plus sign"); $unsigned_integer -> match ( $number , [ $number , "", $number], test => "Unsigned big number"); $unsigned_integer -> no_match ("-$number", reason => "Number has - sign"); $unsigned_integer -> no_match ("+$number", reason => "Number has + sign"); } foreach my $failure (@failures) { my ($subject, $reason) = @$failure; $integer -> no_match ($subject, reason => $reason); $signed_integer -> no_match ($subject, reason => $reason); } done_testing (); __END__ Regexp-Common-2016020301/t/number/111_integer_base.t000755 000765 000024 00000012501 12643716764 022300 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } # # Play with -base. # my @bases = (2, 5, 8, 16, 23, 36); my %patterns; my $PLAIN = 0; my $SIGNED = 1; foreach my $base (@bases) { my $plain = make_test "Base $base integer pattern", $RE {num} {int}, -base => $base; my $signed = make_test "Base $base integer pattern", $RE {num} {int}, -base => $base, -sign => '[-+]'; $patterns {$base} = [$plain, $signed]; } my @chars = (0 .. 9, 'A' .. 'Z'); for (my $i = 0; $i < @chars; $i ++) { my $char = $chars [$i]; foreach my $base (@bases) { if ($i < $base) { my $subj = $char; $patterns {$base} [$PLAIN] -> match ( $subj, [$subj, "", $char], test => "Single character in base" ); $patterns {$base} [$SIGNED] -> no_match ( $subj, reason => "Not signed" ); $subj = "-$char"; $patterns {$base} [$PLAIN] -> match ( $subj, [$subj, "-", $char], test => "Signed (-) single character in base", ); $patterns {$base} [$SIGNED] -> match ( $subj, [$subj, "-", $char], test => "Signed (-) single character in base", ); $subj = "+$char"; $patterns {$base} [$PLAIN] -> match ( $subj, [$subj, "+", $char], test => "Signed (+) single character in base", ); $patterns {$base} [$SIGNED] -> match ( $subj, [$subj, "+", $char], test => "Signed (+) single character in base", ); } else { for my $subj ($char, "-$char", "+$char") { $patterns {$base} [$PLAIN] -> no_match ( $subj, reason => "Character '$char' out of base $base" ) } } } } my $numbers = [ '00', '00000000000000000000', '0' x 100_000, '11', '10101010101010101010', '11010100110100100111', '12234', '4321', '1234' x 10_000, '00000000000000000000000000000000000000000030000000000000', '444444444444444444444444444444444444', '1234567', '7654321', '12435147126123471234651263154211451235', '412377132477712347716234512374712341541', '2378AB21394CFF9932841EFFA9234', 'DEADBEEF', 'BABEFACE', '87134F13241', 'FEDCBA9876543210' x 1_000, 'ASTORYWELLTOLD', 'BETTERL8THENNEVER', '4SALE', 'AS08142H5A87SDFYOUY4YR09TWRE7YGUASDFA99Q0ASHNR1KF98QERTOQ2C871C123R', 'AL0NGSTR1NGR3P34T3DM4NYT1M3S' x 3_000 ]; my %numbers_by_base; NUMBER: foreach my $number (@$numbers) { my %buckets; $buckets {$_} ++ for split // => $number; for (my $i = @chars - 1; $i >= 0; $i --) { if ($buckets {$chars [$i]}) { push @{$numbers_by_base {$i + 1}} => $number; next NUMBER; } } } my @base_numbers = sort {$a <=> $b} keys %numbers_by_base; foreach my $base (@bases) { my ($pattern, $signed_pattern) = @{$patterns {$base}}; foreach my $base_number (@base_numbers) { foreach my $number (@{$numbers_by_base {$base_number}}) { my $is_big = length ($number) > 100; my $desc_number = $is_big ? "big number" : "number"; if ($base >= $base_number) { my $subj = $number; $pattern -> match ($subj, [$subj, "", $number], test => "Unsigned $desc_number"); $signed_pattern -> no_match ($subj, reason => "No sign for $desc_number"); $subj = "-$number"; $pattern -> match ($subj, [$subj, "-", $number], test => "Signed (-) $desc_number"); $signed_pattern -> match ($subj, [$subj, "-", $number], test => "Signed (-) $desc_number"); $subj = "+$number"; $pattern -> match ($subj, [$subj, "+", $number], test => "Signed (+) $desc_number"); $signed_pattern -> match ($subj, [$subj, "+", $number], test => "Signed (+) $desc_number"); } else { foreach my $subj ($number, "-$number", "+$number") { $pattern -> no_match ($subj, reason => "Out of base characters"); $signed_pattern -> no_match ($subj, reason => "Out of base characters"); } } } } } done_testing (); __END__ Regexp-Common-2016020301/t/number/121_integer_places.t000755 000765 000024 00000006430 12643716764 022642 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } # # Patterns with fixed places. # my @places = (1, 2, 3, 5, 8, 13, 21, 34); my %patterns; foreach my $places (@places) { my $places_pattern = make_test "Integer pattern", $RE {num} {int}, -places => $places; my $places_pattern_signed = make_test "Integer pattern", $RE {num} {int}, -places => $places, -sign => '[-+]'; $patterns {$places} = [$places_pattern, $places_pattern_signed]; } my @numbers; push @numbers => map {"0" x $_} 1 .. ($places [-1] + 1); push @numbers => qw [ 921092 1230981409 1239801 12034009123 120381409 12 098213470 289341728912098510298571873824712384 129834701 1098240 12349 3475 897465121 992342199123499195 999999999 12481 598134 23418 98214510814580 891274102981829570918 981243 1928411 912834 ]; foreach my $number (@numbers) { my $length = length $number; foreach my $places (@places) { my ($pattern, $signed_pattern) = @{$patterns {$places}}; if ($length < $places) { my $reason = "Number too short"; foreach my $subj ($number, "-$number") { $pattern -> no_match ($subj, reason => $reason); $signed_pattern -> no_match ($subj, reason => $reason); } } elsif ($length > $places) { my $reason = "Number too long"; foreach my $subj ($number, "+$number") { $pattern -> no_match ($subj, reason => $reason); $signed_pattern -> no_match ($subj, reason => $reason); } } else { $pattern -> match ($number, [$number, "", $number], test => "Exact length"); $signed_pattern -> no_match ($number, reason => "Number not signed"); $pattern -> match ("-$number", ["-$number", "-", $number], test => "Exact length, signed (-)"); $signed_pattern -> match ("-$number", ["-$number", "-", $number], test => "Exact length, signed (-)"); $pattern -> match ("+$number", ["+$number", "+", $number], test => "Exact length, signed (+)"); $signed_pattern -> match ("+$number", ["+$number", "+", $number], test => "Exact length, signed (+)"); } } } done_testing (); __END__ Regexp-Common-2016020301/t/number/122_integer_places.t000755 000765 000024 00000007241 12643716764 022644 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } # # Patterns with variable places. # my @places = (1, 3, 8, 21, 34); my %patterns; for (my $i = 0; $i < @places; $i ++) { my $places1 = $places [$i]; for (my $j = $i + 1; $j < @places; $j ++) { my $places2 = $places [$j]; my $places = "$places1,$places2"; my $places_pattern = make_test "Integer pattern" => $RE {num} {int}, -places => $places; my $places_pattern_signed = make_test "Integer pattern" => $RE {num} {int}, -places => $places, -sign => '[-+]'; $patterns {$places1} {$places2} = [$places_pattern, $places_pattern_signed]; } } my @numbers; push @numbers => map {"0" x $_} 1 .. ($places [-1] + 1); push @numbers => qw [ 921092 1230981409 1239801 12034009123 120381409 12 098213470 289341728912098510298571873824712384 129834701 1098240 12349 3475 897465121 992342199123499195 999999999 12481 598134 23418 98214510814580 891274102981829570918 981243 1928411 912834 ]; foreach my $number (@numbers) { my $length = length $number; for (my $i = 0; $i < @places; $i ++) { my $places1 = $places [$i]; for (my $j = $i + 1; $j < @places; $j ++) { my $places2 = $places [$j]; my ($pattern, $signed_pattern) = @{$patterns {$places1} {$places2}}; if ($length < $places1) { my $reason = "Number too short"; foreach my $subj ($number, "-$number") { $pattern -> no_match ($subj, reason => $reason); $signed_pattern -> no_match ($subj, reason => $reason); } } elsif ($length > $places2) { my $reason = "Number too long"; foreach my $subj ($number, "+$number") { $pattern -> no_match ($subj, reason => $reason); $signed_pattern -> no_match ($subj, reason => $reason); } } else { my $reason = "Length within bounds"; $pattern -> match ($number, [$number, "", $number], test => $reason); $signed_pattern -> no_match ($number, reason => "Number not signed"); $pattern -> match ("+$number", ["+$number", "+", $number], test => "$reason, signed (+)"); $signed_pattern -> match ("+$number", ["+$number", "+", $number], test => "$reason, signed (+)"); } } } } done_testing (); __END__ Regexp-Common-2016020301/t/number/123_integer_places.t000755 000765 000024 00000006413 12643716764 022645 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } # # Combine places and bases # my $min = 3; my $max = 6; my $pattern = make_test "Integer pattern" => $RE {num} {int}, -base => 4, -places => "$min,$max"; my $pattern_neg = make_test "Integer pattern" => $RE {num} {int}, -base => 4, -places => "$min,$max", -sign => '[-]'; my @numbers; push @numbers => map {"0" x $_} 1 .. 7; push @numbers => qw [ 1201201 21013 120 123100 3210310 1231231013 2130130 2130 31230 13012302 13130 ]; foreach my $number (@numbers) { my $length = length $number; if ($length < $min) { foreach my $subj ($number, "-$number", "+$number") { $pattern -> no_match ($number, reason => "Number too short"); $pattern_neg -> no_match ($number, reason => "Number too short"); } } elsif ($length > $max) { foreach my $subj ($number, "-$number", "+$number") { $pattern -> no_match ($number, reason => "Number too long"); $pattern_neg -> no_match ($number, reason => "Number too long"); } } else { $pattern -> match ($number, [$number, "", $number], test => "Number of correct length"); $pattern_neg -> no_match ($number, reason => "Number not signed"); $pattern -> match ("-$number", ["-$number", "-", $number], test => "Signed number of correct length"); $pattern_neg -> match ("-$number", ["-$number", "-", $number], test => "Signed number of correct length"); $pattern -> match ("+$number", ["+$number", "+", $number], test => "Signed number of correct length"); $pattern_neg -> no_match ($number, reason => "Number incorrectly signed"); } } my @bad_characters = ( ["Number contains space", "12 12", "111 1"], ["Digit exceeds base", "1234", "4", "121212124", "9123123123"], ["Letter in number", "123A", "Q", "202O20", "123Z21"], ); foreach my $entry (@bad_characters) { my ($reason, @subjs) = @$entry; foreach my $subj (@subjs) { $pattern -> no_match ($subj, reason => $reason); $pattern_neg -> no_match ($subj, reason => $reason); } } done_testing (); __END__ Regexp-Common-2016020301/t/number/131_integer_sep.t000755 000765 000024 00000006760 12643716764 022171 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } my $pattern_c = make_test "Integer pattern" => $RE {num} {int}, -sep => ","; my $pattern_u = make_test "Integer pattern" => $RE {num} {int}, -sep => "_"; my @pass_numbers = qw [ 0 00 000 123 45 6 123,456 78,901 2,345 0,000,000,000,000,000,000,000,000,000,000,000,000 00,000,000,000,000,000,000,000,000,000,000,000,000 000,000,000,000,000,000,000,000,000,000,000,000,000 5,098,145,984,398,345 2,831,471,982 38,247,113,284,912 7,312,834 8,732,123,847,132 45,884,573 99,234,759,141 27,348,134,581 214,357,191 ]; foreach my $number (@pass_numbers) { my $sep_c = $number =~ y/,/,/; my $test = $sep_c == 0 ? "No separator" : $sep_c == 1 ? "Single separator" : "Multiple separators"; $pattern_c -> match ( $number => [ $number, "", $number], test => $test); $pattern_c -> match ("-$number" => ["-$number", "-", $number], test => "$test, signed (-)"); $pattern_c -> match ("+$number" => ["+$number", "+", $number], test => "$test, signed (+)"); $number =~ s/,/_/g; $pattern_u -> match ( $number => [ $number, "", $number], test => $test); $pattern_u -> match ("-$number" => ["-$number", "-", $number], test => "$test, signed (-)"); $pattern_u -> match ("+$number" => ["+$number", "+", $number], test => "$test, signed (+)"); } my @failures = ( ["Wrong separator" => qw [0.000 1,234_456,789], "100 123"], ["Leading separator" => qw [,123 ,456,789]], ["Trailing separator" => qw [123, 456,789,]], ["Double separator" => qw [0,,000 123,456,,789]], ["No digits" => qw [, ,,]], ["Wrong number of digits in group" => qw [1,23,456 1,2345,678 489,1234,345,169,000]], ["Wrong number of digits in last group" => qw [123,4567 456,78]], ["Too many leading digits" => qw [1234,567 0000,000,000 8129132412341,000]], ["Trailing garbage" => qw [123,456,789foo 000,bar], "123,456 ", "987,543,611\n"], ["Leading garbage" => qw [baz,123,456 qux,000], " 123,456"], ["Inner garbage" => qw [123,foo,456 1a3,456], "123, 456"], ["Empty string" => ""], ["Garbage" => "wibble", "\n", "foo,123,bar"], ); foreach my $failure (@failures) { my ($reason, @subjects) = @$failure; foreach my $subject (@subjects) { $pattern_c -> no_match ($subject, reason => $reason); } } done_testing (); __END__ Regexp-Common-2016020301/t/number/141_integer_group.t000755 000765 000024 00000012323 12643716764 022527 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use warnings; no warnings 'syntax'; use Regexp::Common; use Test::More; my $r = eval "require Test::Regexp; 1"; unless ($r) { print "1..0 # SKIP Test::Regexp not found\n"; exit; } sub make_test { my ($name, $base, @options) = @_; my $pat = $base; while (@options) { my $opt = shift @options; if (@options && $options [0] !~ /^-/) { my $val = shift @options; $pat = $$pat {$opt => $val}; $name .= ", $opt => $val"; } else { $pat = $$pat {$opt}; $name .= ", $opt"; } } my $keep = $$pat {-keep}; Test::Regexp:: -> new -> init ( pattern => $pat, keep_pattern => $keep, name => $name, ); } my %patterns = ( 2 => make_test ("Integer pattern" => $RE {num} {int}, -sep => ",", -group => 2), 4 => make_test ("Integer pattern" => $RE {num} {int}, -sep => ",", -group => 4), 5_7 => make_test ("Integer pattern" => $RE {num} {int}, -sep => ",", -group => "5,7"), ); my %pass_numbers = ( 2 => [qw [0 00 0,00 00,00 1,23,45,67,89 12,34,56,78,90]], 4 => [qw [0 00 000 0000 0,0000 00,0000 1,2345,6789 12,3456,7890 123,0987,6782,1235]], 5_7 => [qw [0 00 000 0000 00000 000000 0000000 1,23456 12,3456789 8239317,54321 37819,4927658,897423,52904,3906817,34532]], ); foreach my $key (sort {$a cmp $b} keys %patterns) { my $pattern = $patterns {$key}; my $numbers = $pass_numbers {$key}; foreach my $number (@$numbers) { my $c = $number =~ y/,/,/; my $test = $c == 0 ? "No separator" : $c == 1 ? "Single separator" : "Multiple separators"; $pattern -> match ( $number => [ $number, "", $number], test => $test); $pattern -> match ("-$number" => ["-$number", "-", $number], test => "$test, signed (-)"); $pattern -> match ("+$number" => ["+$number", "+", $number], test => "$test, signed (+)"); } } my @failures = ( ["Wrong separator" => { 2 => [qw [0.00 1,23_46,79], "10 13"], 4 => [qw [0.0000 1,2345_6789], "1000 1313"], 5_7 => [qw [0.00000 1,23456_9876543], "10000 131313"], }], ["Leading separator" => { 2 => [qw [,123 ,456,789]], 4 => [qw [,1234 ,4567,7890]], 5_7 => [qw [,123456 ,98765,123456]], }], ["Trailing separator" => { 2 => [qw [123, 456,789,]], 4 => [qw [1234, 4567,1234,]], 5_7 => [qw [123456, 12345,0987654,]], }], ["Double separator" => { 2 => [qw [0,,00 23,45,,89]], 4 => [qw [0,,0000 123,4568,,1789]], 5_7 => [qw [0,,000000 123,456654,,789987]], }], ["No digits" => { 2 => [qw [, ,,]], }], ["Wrong number of digits in group" => { 2 => [qw [1,3,45 1,234,78 489,12 1,234,56,78]], 4 => [qw [13,45 1,23478,0000 11489,1212 1,23456,5678]], 5_7 => [qw [13,4589 1,12345678,98765 87654321,12345 123,4567,12345 123,456,123456]], }], ["Wrong number of digits in last group" => { 2 => [qw [12,4 45,678 1,23,456]], 4 => [qw [847,345 983,59025 123,4567,98387]], 5_7 => [qw [12,4567 89353,94768904 1490,49278,98765432]], }], ["Too many leading digits" => { 2 => [qw [000 123,45 948,89,90,23,24]], 4 => [qw [00000 12345,9489 899421,3890,2940]], 5_7 => [qw [00000000 89478211,904789 95872938,58903,1589387]], }], ["Trailing garbage" => { 2 => [qw [00foo 1,23,45,ba], "12,24 ", "12,24\n"], 4 => [qw [0000foo 1,2345,6789,barr], "12,2424 ", "12,2424\n"], 5_7 => [qw [00000foo 1,234567,67890,barrr], "12,242424 ", "12,242424\n"], }], ["Leading garbage" => { 2 => [qw [f1 foo12,34], " 12,34"], 4 => [qw [f123 foo1234,4567], " 1234,5678"], 5_7 => [qw [f12345 foo12340,04567], " 12340,05678"], }], ["Inner garbage" => { 2 => [qw [12,fo,56 1a,56], "13, 46"], 4 => [qw [1234,foob,5678 1a23,5678], "1234, 4567"], 5_7 => [qw [12345,foobar,5678901 1a23,45678], "1234, 456789"], }], ["Empty string" => { 2 => [""], 4 => [""], 5_7 => [""], }], ["Garbage" => { 2 => ["wibble", "\n", "fo,12,ar"], 4 => ["wibble", "\n", "foob,12,barb"], 5_7 => ["wibble", "\n", "foob,12345,barbaz"], }], ); foreach my $failure (@failures) { my ($reason, $data) = @$failure; foreach my $key (sort {$a cmp $b} keys %$data) { my $pattern = $patterns {$key}; my $subjects = $$data {$key}; foreach my $subject (@$subjects) { $pattern -> no_match ($subject, reason => $reason); } } } done_testing (); __END__ Regexp-Common-2016020301/t/number/decimal.t000755 000765 000024 00000010354 12117365661 020661 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib qw {blib/lib}; use Regexp::Common qw /RE_num_decimal/; use t::Common; my $decimal = $RE {num} {decimal}; # The following arrays contain valid numbers in the respective bases - # and the numbers aren't valid in the next array. my @data = ( [36 => [qw /regexp common perl5/]], [16 => [qw /deadbeaf c0c0a c1a0 55b/]], [10 => [qw /81320981536123490812346123 129 9/]], [ 8 => [qw /777 153/]], [ 2 => [qw /0 1 1010101110/]], ); my (%targets, @tests); foreach my $entry (@data) { my ($base, $list) = @$entry; $targets {"${base}_int"} = { list => $list, query => sub {$_ [0]}, wanted => sub {$_ [0], "", $_ [0], $_ [0], undef, undef} }; for my $exp ([dot => "."], [comma => ","]) { my ($name, $punct) = @$exp; $targets {"${base}_int_${name}"} = { list => $list, query => sub {$_ [0] . $punct}, wanted => sub {$_ [0] . $punct, "", $_ [0] . $punct, $_ [0], $punct, ""} }; $targets {"${base}_${name}_frac"} = { list => $list, query => sub {$_ [0] . $punct}, wanted => sub {$_ [0] . $punct, "", $_ [0] . $punct, $_ [0], $punct, ""} }; $targets {"${base}_minus_${name}_frac"} = { list => $list, query => sub {"-" . $_ [0] . $punct}, wanted => sub {"-" . $_ [0] . $punct, "-", $_ [0] . $punct, $_ [0], $punct, ""} }; $targets {"${base}_plus_${name}_frac"} = { list => $list, query => sub {"+" . $_ [0] . $punct}, wanted => sub {"+" . $_ [0] . $punct, "+", $_ [0] . $punct, $_ [0], $punct, ""} }; } $targets {"${base}_minus_int"} = { list => $list, query => sub {"-" . $_ [0]}, wanted => sub {"-" . $_ [0], "-", $_ [0], $_ [0], "", ""} }; $targets {"${base}_plus_int"} = { list => $list, query => sub {"+" . $_ [0]}, wanted => sub {"+" . $_ [0], "+", $_ [0], $_ [0], "", ""} }; } $targets {dot} = { list => ['.'], query => sub {$_ [0]}, }; sub __ { map {;"${_}_int", "${_}_int_dot", "${_}_minus_int", "${_}_plus_int", "${_}_dot_frac", "${_}_minus_dot_frac", "${_}_plus_dot_frac", } @_ } sub _2 { map {;"${_}_minus_int", "${_}_plus_int", "${_}_minus_dot_frac", "${_}_plus_dot_frac", } @_ } sub _3 { map {;"${_}_int", "${_}_int_dot", "${_}_dot_frac", } @_ } push @tests => { name => 'basic', re => $decimal, sub => \&RE_num_decimal, pass => [__ (grep {$_ <= 10} map {$$_ [0]} @data)], fail => [__ (grep {$_ > 10} map {$$_ [0]} @data), "dot"], }; push @tests => { name => 'basic -- signed', re => $decimal -> {-sign => '[-+]'}, sub => \&RE_num_decimal, sub_args => [-sign => '[-+]'], pass => [ _2 (grep {$_ <= 10} map {$$_ [0]} @data)], fail => [(_3 (grep {$_ <= 10} map {$$_ [0]} @data)), __ (grep {$_ > 10} map {$$_ [0]} @data), "dot"], }; foreach my $data (@data) { my $base = $$data [0]; my @passes = __ grep {$_ <= $base} map {$$_ [0]} @data; my @failures = __ grep {$_ > $base} map {$$_ [0]} @data; my @commas = grep {/^${base}_.*comma/} keys %targets; push @tests => { name => "base_${base}", re => $RE {num} {decimal} {-base => $base}, sub => \&RE_num_decimal, sub_args => [-base => $base], pass => [@passes], fail => [@failures, @commas, "dot"], }; push @tests => { name => "base_${base}_comma", re => $RE {num} {decimal} {-base => $base} {-radix => ","}, sub => \&RE_num_decimal, sub_args => [-base => $base, -radix => ","], pass => [(grep {!/dot/} @passes), @commas], fail => [(grep {/^${base}/} @failures)], }; } run_new_tests targets => \%targets, tests => \@tests, version_from => 'Regexp::Common::number', ; __END__ Regexp-Common-2016020301/t/number/number.t000755 000765 000024 00000001616 12116413566 020551 0ustar00abigailstaff000000 000000 #!/usr/bin/perl # # Test for the support functions of Regexp::Common::number # use strict; use lib qw {blib/lib}; use Regexp::Common; use t::Common; my @wrong_bases = (0, 40); my @correct_bases = (1, 29, 36); my @types = qw /decimal real/; my $tests = (@wrong_bases + @correct_bases) * @types; my $count = 0; print "1..$tests\n"; foreach my $base (@wrong_bases) { foreach my $type (@types) { eval {"" =~ $RE {num} {$type} {-base => $base}}; printf "%s %d - \$RE {num} {$type} {-base => $base}\n" => $@ && $@ =~ /Base must be between 1 and 36/ ? "ok" : "not ok", ++ $count; } } foreach my $base (@correct_bases) { foreach my $type (@types) { eval {"" =~ $RE {num} {$type} {-base => $base}}; printf "%s %d - \$RE {num} {$type} {-base => $base}\n" => $@ ? "not ok" : "ok", ++ $count; } } __END__ Regexp-Common-2016020301/t/comment/delimited.t000755 000765 000024 00000006122 12116413566 021370 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib "blib/lib"; use Regexp::Common qw /RE_comment_ALL/; use t::Common qw /run_new_tests/; use warnings; BEGIN {$^W = 0 if $] < 5.006} my @delimited = do { no warnings; ( [qw {comment ;} => ['Algol 60']], [qw {/* */} => [qw {ALPACA B C C-- LPC PL/I}]], [qw {; ;} => [qw {Befunge-98 Funge-98 Shelta}]], [qw {} => [qw {BML}]], [qw !{ }! => [qw {False}]], [qw {, ,} => [qw {Haifu}]], [qw {/** */} => [qw {JavaDoc}]], [qw {(* *)} => [qw {Oberon}]], [qw {" "} => [qw {Smalltalk}]], [qw {|| !!} => [qw {*W}]], ) }; # # Some basic comments, not including delimiters. # my @comments = ("", "This is a comment", "This is a\nmultiline comment", "\n", "*", "\n*\n", "/*", "(*", "||", "{"); # Targets, and test suites. my %targets; my @tests; foreach my $entry (@delimited) { my ($start, $end) = @$entry [0, 1]; my $langs = $$entry [2]; my $pass_key = "pass_${start}_${end}"; my $fail_key = "fail_${start}_${end}"; $targets {$pass_key} = { list => \@comments, query => sub {$start . $_ [0] . $end}, wanted => sub {$_, $start, $_ [0], $end}, }; # Create bad comments. my @bad_comments; # No terminating token. # Not for pre 5.006 perls due to tests taking too much time. push @bad_comments => map {"$start$_"} @comments if $] >= 5.006; # No starting token. push @bad_comments => map {"$_$end"} grep {index ($_, $start)} @comments; # Double terminators. push @bad_comments => map {"$start$_$end$end"} @comments; # Double the comment. push @bad_comments => map {"$start$_$end" x 2} @comments; # Different token. my @bad_tokens = grep {index $_ -> [0], $start} @delimited; push @bad_comments => map {my $c = $_; map {$_ -> [0] . $c . $_ -> [1]} @bad_tokens } @comments; # No tokens. push @bad_comments => @comments; # Text preceeding comment. push @bad_comments => map {"Text $start$_$end"} @comments; # Some more. push @bad_comments => ""; push @bad_comments => "/* This is a C comment */" if $start ne '/*'; push @bad_comments => "{ This is a Pascal comment }" if $start ne '{'; $targets {$fail_key} = { list => \@bad_comments, }; foreach my $lang (@$langs) { my $langX = $lang; $langX =~ s/\W/X/g; no strict 'refs'; push @tests => { name => $lang, regex => $RE {comment} {$lang}, sub => \&{"RE_comment_$langX"}, pass => [$pass_key], fail => [$fail_key], skip_sub => sub {$lang eq 'JavaDoc' && $_ [0] eq 'fail' && $_ [1] eq '/***/'}, } } } run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::comment', __END__ Regexp-Common-2016020301/t/comment/html.t000755 000765 000024 00000004640 12116413566 020377 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib "blib/lib"; use Regexp::Common qw /RE_comment_HTML/; use t::Common qw /run_new_tests cross/; use warnings; my @good = ("", "This is a comment", "This is - a comment", "This is - - comment", ">This is a comment", ">This is a comment<", "This is comment", ">", "<>", "><", "<"); my @spec = ("", ">", "->", " [0], ss, $_ -> [1], ss, $_ -> [2], ss]} cross \@spec, \@spec, \@spec; # Targets, and test suites. my %targets; my @tests; $targets {simple} = { list => \@good, query => sub {""}, wanted => sub {$_, ""}, }; $targets {simple_space} = { list => \@spaced, query => sub {""}, wanted => sub {$_, ""}, }; $targets {crossed2} = { list => \@cross3, query => sub {"", # Missing ! "", # Not enough dashes, "", # Too many starting dashes. "", # Space after ", # Garbage after comment )} @good; $targets {bad1} = { list => \@bad, }; $targets {bad2} = { list => \@crossed, query => sub {""}, }; push @tests => { name => 'HTML', regex => $RE {comment} {HTML}, pass => [qw /simple simple_space crossed crossed2/], fail => [qw /bad1 bad2/], sub => \&RE_comment_HTML, }; run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::comment', ; __END__ Regexp-Common-2016020301/t/comment/nested.t000755 000765 000024 00000015275 12116413566 020723 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib "blib/lib"; use Regexp::Common qw /pattern RE_comment_ALL/; use t::Common qw /run_new_tests cross ww/; use warnings; pattern name => [qw /comment fairy-language-1/], create => sub {my $re = Regexp::Common::comment::nested "-(-", "-)-"; exists $_ [1] -> {-keep} ? qr /($re)/ : qr /$re/ }, version => 5.006 ; pattern name => [qw /comment fairy-language-2/], create => sub {my $re = Regexp::Common::comment::nested "(", ")"; exists $_ [1] -> {-keep} ? qr /($re)/ : qr /$re/ }, version => 5.006 ; my @data = do { no warnings; ( { nested_tokens => [["(*" => "*)"]], languages => [qw /Caml Modula-2 Modula-3/], }, { start_tokens => ["//"], nested_tokens => [["/*" => "*/"]], languages => [qw /Dylan/], }, { start_tokens => ["--", "---", "-----"], nested_tokens => [["{-", "-}"]], languages => [qw /Haskell/], }, { start_tokens => ["!"], # Should not be followed by \ nested_tokens => [["!\\", "\\!"]], languages => [qw /Hugo/], }, { start_tokens => ["#"], nested_tokens => [["(*" => "*)"]], languages => [qw /SLIDE/], }, { nested_tokens => [["-(-" => "-)-"]], languages => [qw /fairy-language-1/], }, { nested_tokens => [["(" => ")"]], languages => [qw /fairy-language-2/], }, ); }; $$_ {start_tokens} ||= [] for @data; $$_ {nested_tokens} ||= [] for @data; my @s_tokens = do { my %h; grep {!$h {$_} ++} map {@{$$_ {start_tokens}}} @data }; my @pairs = do { my %h; grep {!$h {$$_ [0]} {$$_ [1]} ++} map {@{$$_ {nested_tokens}}} @data }; # # Create some comments. # my @comments = ("", "This is a comment", "This is a\nmultiline comment", "\n", map {" $_ "} @s_tokens); my @no_eol = grep {!/\n/} @comments; my (%targets, @tests); foreach my $s_token (@s_tokens) { my $pass_key = "start_pass_$s_token"; my $fail_key = "start_fail_$s_token"; $targets {$pass_key} = { list => \@no_eol, query => sub {$s_token . $_ [0] . "\n"}, }; # Build a list of "bad" comments. my @bad; # No trailing newline. push @bad => map {"$s_token$_"} @no_eol; # Double newline. push @bad => map {"$s_token$_\n\n"} @no_eol; # Double comment. push @bad => map {"$s_token$_\n" x 2} @no_eol; # Leading garbage. push @bad => map {ww (1, 10) . "$s_token$_\n"} @no_eol; # Trailing garbage. push @bad => map {"$s_token$_\n" . ww (1, 10)} @no_eol; $targets {$fail_key} = { list => \@bad, }; } my @parts = cross ["", "[]", "\n"], ["", "7^%", "\n"], ["", "comment", "\n"]; foreach my $pair (@pairs) { my ($start, $end) = @$pair; my $pass_key = "nested_pass_${start}_${end}"; my $fail_key = "nested_fail_${start}_${end}"; $targets {"${pass_key}_simple"} = { list => \@comments, query => sub {$start . $_ [0] . $end}, }; my @nested = map {$start . $$_ [0] . $start . $$_ [1] . $end . $$_ [2] . $end} @parts; $targets {"${pass_key}_nested"} = { list => \@nested, }; # Build a list of "bad" comments. my @bad; # No end token. push @bad => map {"$start$_"} @comments; # No begin token. push @bad => map {"$_$end"} @comments; # Double end token. push @bad => map {"$start$_$end$end"} @comments; # Double begin token. push @bad => map {"$start$start$_$end"} @comments; # Double comment. push @bad => map {"$start$_$end" x 2} @comments; # Leading garbage. push @bad => map {ww (1, 10) . "$start$_$end"} @comments; # Trailing garbage. push @bad => map {"$start$_$end" . ww (1, 10)} @comments; # Bad nested comments. # Extra start token. push @bad => map {"$start$_"} @nested; # Extra end token. push @bad => map {"$_$end"} @nested; # Leading garbage. push @bad => map {ww (1, 10) . $_} @nested; # Trailing garbage. push @bad => map {$_ . ww (1, 10)} @nested; # Double comment. push @bad => map {$_ x 2} @nested; $targets {$fail_key} = { list => \@bad }; } foreach my $data (@data) { foreach my $language (@{$$data {languages}}) { my (@passes, @failures); foreach my $my_token (@{$$data {start_tokens}}) { push @passes => "start_pass_$my_token"; push @failures => "start_fail_$my_token"; } foreach my $s_token (@s_tokens) { # Failure, unless there's a token that's a prefix of $s_token. my $ok = 1; foreach my $my_token (@{$$data {start_tokens}}) { $ok = 0 if index ($s_token, $my_token) == 0; } push @failures => "start_pass_$s_token" if $ok; } foreach my $my_pair (@{$$data {nested_tokens}}) { my ($my_start, $my_end) = @$my_pair; push @passes => "nested_pass_${my_start}_${my_end}_simple", "nested_pass_${my_start}_${my_end}_nested"; push @failures => "nested_fail_${my_start}_${my_end}"; } foreach my $pair (@pairs) { my ($start, $end) = @$pair; # Failure, unless there's a pair that fits. my $ok = 1; foreach my $my_pair (@{$$data {nested_tokens}}) { my ($my_start, $my_end) = @$my_pair; $ok = 0 if index ($start, $my_start) == 0 && rindex ($end, $my_end) == length ($end) - length ($my_end); } push @failures => "nested_pass_${start}_${end}_simple", "nested_pass_${start}_${end}_nested" if $ok; } (my $sub = "RE_comment_$language") =~ s/\W/X/g; my $test = { name => $language, re => $RE {comment} {$language}, pass => \@passes, fail => \@failures, }; # If we call 'pattern' after the 'use Regexp::Common', we won't # (can't) import a subroutine. no strict 'refs'; $$test {sub} = \&{$sub} if defined &{"main::$sub"}; push @tests => $test; } } run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::comment', version => 5.006, ; __END__ Regexp-Common-2016020301/t/comment/pascal.t000644 000765 000024 00000005173 12116413566 020675 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib "blib/lib"; use Regexp::Common qw /RE_comment_Pascal/; use t::Common qw /run_new_tests/; use warnings; BEGIN {$^W = 0 if $] < 5.006} my @open = qw [{ (*]; my @close = qw [} *)]; # # Some basic comments, not including delimiters. # my @comments = ("", "This is a comment", "This is a\nmultiline comment", "\n", "*", "\n*\n", "**", "*****", "** **", "/*", "||"); # Targets, and test suites. my %targets; my @tests; foreach my $start (@open) { foreach my $end (@close) { my $lang = "Pascal"; my $pass_key = "pass_${start}_${end}"; my $fail_key = "fail_${start}_${end}"; $targets {$pass_key} = { list => \@comments, query => sub {$start . $_ [0] . $end}, wanted => sub {$_, $start, $_ [0], $end}, }; # Create bad comments. my @bad_comments; # No terminating token. # Not for pre 5.006 perls due to tests taking too much time. push @bad_comments => map {"$start$_"} @comments if $] >= 5.006; # No starting token. push @bad_comments => map {"$_$end"} grep {index ($_, $start)} @comments; # Double terminators. push @bad_comments => map {"$start$_$end$end"} @comments; # Double the comment. push @bad_comments => map {"$start$_$end" x 2} @comments; # Different token. my @bad_open = qw [// /* --]; my @bad_close = (qw [*/ --], "\n"); foreach my $close (@close) { push @bad_comments => map {my $o = $_; map {"ot$_$close"} @comments} @bad_open; } foreach my $open (@open) { push @bad_comments => map {my $c = $_; map {"$open$_$c"} @comments} @bad_close; } # No tokens. push @bad_comments => @comments; # Text preceeding comment. push @bad_comments => map {"Text $start$_$end"} @comments; # Some more. push @bad_comments => ""; push @bad_comments => "/* This is a C comment */"; $targets {$fail_key} = { list => \@bad_comments, }; no strict 'refs'; push @tests => { name => $lang, regex => $RE {comment} {$lang}, sub => \&{"RE_comment_$lang"}, pass => [$pass_key], fail => [$fail_key], } } } run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::comment', __END__ Regexp-Common-2016020301/t/comment/single_line.t000755 000765 000024 00000014612 12116413566 021723 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib "blib/lib"; use Regexp::Common qw /RE_comment_ALL/; use t::Common qw /run_new_tests ww/; BEGIN {$^W = 0 if $] < 5.006}; use warnings; # 1. List of tokens. # 2. List of languages. my @data = do { no warnings; ( {start_tokens => ["\\"], # No qw here, 5.6.0 parses it incorrectly. languages => [qw {ABC Forth}], }, {start_tokens => [qw {# //}], languages => [qw {Advisor}], }, {start_tokens => [qw {--}], languages => [qw {Ada Alan Eiffel lua}], }, {start_tokens => [qw {;}], languages => [qw {Advsys CQL Lisp LOGO M MUMPS REBOL Scheme SMITH zonefile}], }, {start_tokens => [qw {#}], languages => [qw {awk fvwm2 Icon m4 mutt Perl Python QML R Ruby shell Tcl}], }, {start_tokens => [qw {* ! REM}], languages => [[BASIC => 'mvEnterprise']], }, {start_tokens => [qw {//}], languages => [qw {beta-Juliet Portia Ubercode}, q {Crystal Report}], }, {start_tokens => [qw {%}], languages => [qw {CLU LaTeX TeX slrn}], }, {start_tokens => [qw {!}], languages => [qw {Fortran}], }, {start_tokens => [qw {NB}], languages => [qw {ILLGOL}], }, {start_tokens => ["PLEASE NOT", "PLEASE NOT", "PLEASE N'T", "DO NOT", "DO N'T", "DO NOT", "PLEASE DO NOT", "PLEASE DO NOT", "PLEASE DO N'T"], languages => [qw {INTERCAL}]}, {start_tokens => [qw {NB.}], languages => [qw {J}], }, {start_tokens => [qw !{!], languages => [[qw {Pascal Alice}]], end_tokens => [qw !}!], }, {start_tokens => [qw {. ;}], languages => [qw {PL/B}], }, {start_tokens => [qw {`}], languages => [qw {Q-BAL}], }, {start_tokens => [qw {-- --- -----}], languages => [qw {SQL}], # SQL comments start with /-{2,}/ }, {start_tokens => ['\\"'], # No qw here, 5.6.0 parses it incorrectly. languages => [qw {troff}], }, {start_tokens => [qw {"}], languages => [qw {vi}], }, {start_tokens => [qw {'}], languages => [qw {ZZT-OOP}], }, ); }; # # Extract the markers. # # my @tokens = map {@{$$_ {start_tokens}}} @data; my @tokens; foreach my $data (@data) { if ($$data {end_tokens}) { push @tokens => map {[$$data {start_tokens} [$_] => $$data {end_tokens} [$_]]} 0 .. $#{$$data {start_tokens}}; } else { push @tokens => map {[$_ => "\n"]} @{$$data {start_tokens}} } } # # Some basic comments, not including delimiters. # my @comments = ("", "This is a comment", "A\tcomment", "Another /* comment"); # Targets, and test suites. my %targets; my @tests; my @bad; foreach my $token (@tokens) { my ($start, $end) = @$token; my $pass_key = "pass_${start}_${end}"; my $fail_key = "fail_${start}_${end}"; my @my_bad; $targets {$pass_key} = { list => \@comments, query => sub {$start . $_ [0] . $end}, wanted => sub {$_, $start, $_ [0], $end}, }; # No trailing newline. push @bad => map {"$start$_"} @comments; # No leading token. push @bad => map {"$_$end"} @comments; # Double newlines. push @my_bad => map {"$start$_$end$end"} @comments; # Double comments. push @my_bad => map {"$start$_$end" x 2} @comments; # Garbage trailing the comments. push @my_bad => map {"$start$_$end" . ww (1, 5)} @comments; # Garbage leading the comments. push @my_bad => map {ww (1, 5) . "$start$_$end"} @comments; $targets {$fail_key} = { list => \@my_bad } } # A few extras. push @bad => ("/* This is a C comment */", "(* This is a Pascal comment *)", ""); $targets {bad} = { list => \@bad }; foreach my $entry (@data) { my ($start_tokens, $langs) = @$entry {qw /start_tokens languages/}; my $end_tokens = $$entry {end_tokens} ? $$entry {end_tokens} : [("\n") x @$start_tokens]; my @my_tokens = map {[$$start_tokens [$_], $$end_tokens [$_]]} 0 .. $#$start_tokens; my %my_tokens = map {$_ => 1} map {join _ => $$start_tokens [$_], $$end_tokens [$_]} 0 .. $#$start_tokens; my @pass_tokens = map {join _ => "pass", $$start_tokens [$_], $$end_tokens [$_]} 0 .. $#$start_tokens; # # Find out what should fail. # # 1. A global 'bad' list. # my @fail_tokens = ("bad"); # # 2. Failures for our tokens. # push @fail_tokens => map {join _ => "fail", $$start_tokens [$_], $$end_tokens [$_]} 0 .. $#$start_tokens; # # 3. Passes for tokens that aren't ours, and don't "fit" ours. # TOKEN: foreach my $token (@tokens) { my ($start, $end) = @$token; foreach my $my_token (@my_tokens) { my ($my_start, $my_end) = @$my_token; if ($start =~ /^\Q$my_start\E/ && $end =~ /\Q$my_end\E$/) { next TOKEN; } } push @fail_tokens => join _ => pass => @$token; } foreach my $lang (@$langs) { my $name = ref $lang ? join "/" => @$lang : $lang; my $re = ref $lang ? $RE {comment} {$lang -> [0]} {$lang -> [1]} : $RE {comment} {$lang}; my $sub = ref $lang ? join "_" => "RE_comment", @$lang : "RE_comment_$lang"; $sub =~ s/\W/X/g; no strict 'refs'; push @tests => { name => $name, regex => $re, sub => \&$sub, pass => \@pass_tokens, fail => \@fail_tokens, }; } } run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::comment', __END__ Regexp-Common-2016020301/t/comment/single_or_multiline.t000755 000765 000024 00000012605 12116413566 023476 0ustar00abigailstaff000000 000000 #!/usr/bin/perl use strict; use lib "blib/lib"; use Regexp::Common qw /RE_comment_ALL/; use t::Common qw /run_new_tests ww/; use warnings; BEGIN {$^W = 0 if $] < 5.006;} # 1. tokens for single line comments. # 2. start/end tokens for multi-line comments. # 3. list of languages this applies to. my @data = do { no warnings; ( [[qw {//}] => [[qw {/* */}]] => [qw {C++ C# Cg ECMAScript FPL Java JavaScript}], ], [[qw {#}] => [[qw {/* */}]] => [qw {Nickle}], ], [[qw {//}] => [[qw !{ }!], [qw !(* *)!]] => [[qw /Pascal Delphi/], [qw /Pascal Free/], [qw /Pascal GPC/]], ], [[qw {!}] => [[qw {/* */}]] => [qw {PEARL}] ], [[qw {# //}] => [[qw {/* */}]] => [qw {PHP}] ], [[qw {--}] => [[qw {/* */}]] => [qw {PL/SQL}] ] ); }; # Grab the single line tokens. my @s_tokens = do {my %h; grep {!$h {$_} ++} map {@{$$_ [0]}} @data}; # Grab the multiline line tokens. my @mo_tokens = do {my %h; grep {!$h {$_} ++} map {map {$$_ [0]} @{$$_ [1]}} @data}; my @mc_tokens = do {my %h; grep {!$h {$_} ++} map {map {$$_ [1]} @{$$_ [1]}} @data}; my @comments = ("", "This is a comment", "This is a\nmultiline comment", "\n", (map {" $_ "} @s_tokens, @mo_tokens, @mc_tokens)); my @no_eol = grep {!/\n/} @comments; # Targets, and test suites. my %targets; my @tests; my @bad; # Tests for the single line comments (including failures). foreach my $token (@s_tokens) { my $key = "single_$token"; my $fkey = "single_fail_$token"; $targets {$key} = { list => \@no_eol, query => sub {$token . $_ [0] . "\n"}, }; my @s_bad; # No trailing newline. push @s_bad => map {"$token$_"} @no_eol; # Double newline. push @s_bad => map {"$token$_\n\n"} @no_eol; # Double comment. push @s_bad => map {"$token$_\n" x 2} @no_eol; # Leading garbage. push @s_bad => map {ww (1, 10) . "$token$_\n"} @no_eol; # Trailing garbage. push @s_bad => map {"$token$_\n" . ww (1, 10)} @no_eol; $targets {$fkey} = { list => \@s_bad, }; } # No leading token. $targets {single_fail} = { list => [map {"$_\n"} @no_eol], }; # Tests for the multi line comments (including failures). for (my $i = 0; $i < @mc_tokens; $i ++) { my $start = $mo_tokens [$i]; my $end = $mc_tokens [$i]; my $key = "multi_${start}_$end"; my $key2 = "multi2_${start}_$end"; my $fkey = "multi_fail_${start}_$end"; my @list = grep {!/\Q$end/} @comments; $targets {$key} = { list => \@list, query => sub {$start . $_ [0] . $end}, }; # Doubling the start token should be ok. $targets {$key2} = { list => \@list, query => sub {$start . $start . $_ [0] . $end}, }; my @m_bad; # No starting token. push @m_bad => map {"$_$end"} @comments; # No ending token. # Hack for old versions of Perl. The regexes will # work there, but it just takes too long to test them. push @m_bad => map {"$start$_"} $] < 5.006 ? @no_eol : @comments; # Double the comment. push @m_bad => map {"$start$_$end" x 2} @comments; # Leading garbage. push @m_bad => map {ww (1, 5) . "$start$_$end"} @comments; # Trailing garbage. push @m_bad => map {"$start$_$end" . ww (1, 5)} @comments; $targets {$fkey} = { list => \@m_bad, }; } # No tokens at all. $targets {fail} = { list => \@comments, }; foreach my $data (@data) { my ($singles, $doubles, $langs) = @$data; my %s_seen; my %m_seen; $s_seen {$_} = 1 for @$singles; $m_seen {join "_" => @$_} = 1 for @$doubles; my @passes = map {"single_$_"} @$singles; push @passes => map {join _ => "multi", @$_} @$doubles; push @passes => map {join _ => "multi2", @$_} @$doubles; my @failures = map {"single_$_"} grep {!$s_seen {$_}} @s_tokens; push @failures => map {"single_fail_$_"} @$singles; push @failures => "single_fail"; # Multiline comments using *other* delimiters. push @failures => map {join _ => "multi", $_} grep {!$m_seen {$_}} map {join _ => $mo_tokens [$_], $mc_tokens [$_]} 0 .. $#mo_tokens; push @failures => map {join _ => "multi_fail", @$_} @$doubles; push @failures => "fail"; foreach my $lang (@$langs) { my $name = ref $lang ? join "/" => @$lang : $lang; my $sub = ref $lang ? join "_" => "RE_comment", @$lang : "RE_comment_$lang"; $sub =~ s/\W/X/g; my $re = ref $lang ? $RE {comment} {$$lang [0]} {$$lang [1]} : $RE {comment} {$lang}; no strict 'refs'; push @tests => { name => $name, re => $re, sub => \&{$sub}, pass => \@passes, fail => \@failures, }; } } run_new_tests tests => \@tests, targets => \%targets, version_from => 'Regexp::Common::comment', __END__ Regexp-Common-2016020301/lib/Regexp/000755 000765 000024 00000000000 12654450573 017340 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/lib/Regexp/Common/000755 000765 000024 00000000000 12654450573 020570 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/lib/Regexp/Common.pm000644 000765 000024 00000064346 12654447171 021143 0ustar00abigailstaff000000 000000 package Regexp::Common; use 5.00473; use strict; BEGIN { # This makes sure 'use warnings' doesn't bomb out on 5.005_*; # warnings won't be enabled on those old versions though. # Since all other files use this file, we can use 'use warnings' # elsewhere as well, but *AFTER* 'use Regexp::Common'. if ($] < 5.006) { $INC {"warnings.pm"} = 1; no strict 'refs'; *{"warnings::unimport"} = sub {0}; } } use warnings; use vars qw /$VERSION %RE %sub_interface $AUTOLOAD/; $VERSION = '2016020301'; sub _croak { require Carp; goto &Carp::croak; } sub _carp { require Carp; goto &Carp::carp; } sub new { my ($class, @data) = @_; my %self; tie %self, $class, @data; return \%self; } sub TIEHASH { my ($class, @data) = @_; bless \@data, $class; } sub FETCH { my ($self, $extra) = @_; return bless ref($self)->new(@$self, $extra), ref($self); } my %imports = map {$_ => "Regexp::Common::$_"} qw /balanced CC comment delimited lingua list net number profanity SEN URI whitespace zip/; sub import { shift; # Shift off the class. tie %RE, __PACKAGE__; { no strict 'refs'; *{caller() . "::RE"} = \%RE; } my $saw_import; my $no_defaults; my %exclude; foreach my $entry (grep {!/^RE_/} @_) { if ($entry eq 'pattern') { no strict 'refs'; *{caller() . "::pattern"} = \&pattern; next; } # This used to prevent $; from being set. We still recognize it, # but we won't do anything. if ($entry eq 'clean') { next; } if ($entry eq 'no_defaults') { $no_defaults ++; next; } if (my $module = $imports {$entry}) { $saw_import ++; eval "require $module;"; die $@ if $@; next; } if ($entry =~ /^!(.*)/ && $imports {$1}) { $exclude {$1} ++; next; } # As a last resort, try to load the argument. my $module = $entry =~ /^Regexp::Common/ ? $entry : "Regexp::Common::" . $entry; eval "require $module;"; die $@ if $@; } unless ($saw_import || $no_defaults) { foreach my $module (values %imports) { next if $exclude {$module}; eval "require $module;"; die $@ if $@; } } my %exported; foreach my $entry (grep {/^RE_/} @_) { if ($entry =~ /^RE_(\w+_)?ALL$/) { my $m = defined $1 ? $1 : ""; my $re = qr /^RE_${m}.*$/; while (my ($sub, $interface) = each %sub_interface) { next if $exported {$sub}; next unless $sub =~ /$re/; { no strict 'refs'; *{caller() . "::$sub"} = $interface; } $exported {$sub} ++; } } else { next if $exported {$entry}; _croak "Can't export unknown subroutine &$entry" unless $sub_interface {$entry}; { no strict 'refs'; *{caller() . "::$entry"} = $sub_interface {$entry}; } $exported {$entry} ++; } } } sub AUTOLOAD { _croak "Can't $AUTOLOAD" } sub DESTROY {} my %cache; my $fpat = qr/^(-\w+)/; sub _decache { my @args = @{tied %{$_[0]}}; my @nonflags = grep {!/$fpat/} @args; my $cache = get_cache(@nonflags); _croak "Can't create unknown regex: \$RE{" . join("}{",@args) . "}" unless exists $cache->{__VAL__}; _croak "Perl $] does not support the pattern " . "\$RE{" . join("}{",@args) . "}.\nYou need Perl $cache->{__VAL__}{version} or later" unless ($cache->{__VAL__}{version}||0) <= $]; my %flags = ( %{$cache->{__VAL__}{default}}, map { /$fpat\Q$;\E(.*)/ ? ($1 => $2) : /$fpat/ ? ($1 => undef) : () } @args); $cache->{__VAL__}->_clone_with(\@args, \%flags); } use overload q{""} => \&_decache; sub get_cache { my $cache = \%cache; foreach (@_) { $cache = $cache->{$_} || ($cache->{$_} = {}); } return $cache; } sub croak_version { my ($entry, @args) = @_; } sub pattern { my %spec = @_; _croak 'pattern() requires argument: name => [ @list ]' unless $spec{name} && ref $spec{name} eq 'ARRAY'; _croak 'pattern() requires argument: create => $sub_ref_or_string' unless $spec{create}; if (ref $spec{create} ne "CODE") { my $fixed_str = "$spec{create}"; $spec{create} = sub { $fixed_str } } my @nonflags; my %default; foreach ( @{$spec{name}} ) { if (/$fpat=(.*)/) { $default{$1} = $2; } elsif (/$fpat\s*$/) { $default{$1} = undef; } else { push @nonflags, $_; } } my $entry = get_cache(@nonflags); if ($entry->{__VAL__}) { _carp "Overriding \$RE{" . join("}{",@nonflags) . "}"; } $entry->{__VAL__} = bless { create => $spec{create}, match => $spec{match} || \&generic_match, subs => $spec{subs} || \&generic_subs, version => $spec{version}, default => \%default, }, 'Regexp::Common::Entry'; foreach (@nonflags) {s/\W/X/g} my $subname = "RE_" . join ("_", @nonflags); $sub_interface{$subname} = sub { push @_ => undef if @_ % 2; my %flags = @_; my $pat = $spec{create}->($entry->{__VAL__}, {%default, %flags}, \@nonflags); if (exists $flags{-keep}) { $pat =~ s/\Q(?k:/(/g; } else { $pat =~ s/\Q(?k:/(?:/g; } return exists $flags {-i} ? qr /(?i:$pat)/ : qr/$pat/; }; return 1; } sub generic_match {$_ [1] =~ /$_[0]/} sub generic_subs {$_ [1] =~ s/$_[0]/$_[2]/} sub matches { my ($self, $str) = @_; my $entry = $self -> _decache; $entry -> {match} -> ($entry, $str); } sub subs { my ($self, $str, $newstr) = @_; my $entry = $self -> _decache; $entry -> {subs} -> ($entry, $str, $newstr); return $str; } package Regexp::Common::Entry; # use Carp; use overload q{""} => sub { my ($self) = @_; my $pat = $self->{create}->($self, $self->{flags}, $self->{args}); if (exists $self->{flags}{-keep}) { $pat =~ s/\Q(?k:/(/g; } else { $pat =~ s/\Q(?k:/(?:/g; } if (exists $self->{flags}{-i}) { $pat = "(?i)$pat" } return $pat; }; sub _clone_with { my ($self, $args, $flags) = @_; bless { %$self, args=>$args, flags=>$flags }, ref $self; } 1; __END__ =pod =head1 NAME Regexp::Common - Provide commonly requested regular expressions =head1 SYNOPSIS # STANDARD USAGE use Regexp::Common; while (<>) { /$RE{num}{real}/ and print q{a number}; /$RE{quoted}/ and print q{a ['"`] quoted string}; /$RE{delimited}{-delim=>'/'}/ and print q{a /.../ sequence}; /$RE{balanced}{-parens=>'()'}/ and print q{balanced parentheses}; /$RE{profanity}/ and print q{a #*@%-ing word}; } # SUBROUTINE-BASED INTERFACE use Regexp::Common 'RE_ALL'; while (<>) { $_ =~ RE_num_real() and print q{a number}; $_ =~ RE_quoted() and print q{a ['"`] quoted string}; $_ =~ RE_delimited(-delim=>'/') and print q{a /.../ sequence}; $_ =~ RE_balanced(-parens=>'()'} and print q{balanced parentheses}; $_ =~ RE_profanity() and print q{a #*@%-ing word}; } # IN-LINE MATCHING... if ( $RE{num}{int}->matches($text) ) {...} # ...AND SUBSTITUTION my $cropped = $RE{ws}{crop}->subs($uncropped); # ROLL-YOUR-OWN PATTERNS use Regexp::Common 'pattern'; pattern name => ['name', 'mine'], create => '(?i:J[.]?\s+A[.]?\s+Perl-Hacker)', ; my $name_matcher = $RE{name}{mine}; pattern name => [ 'lineof', '-char=_' ], create => sub { my $flags = shift; my $char = quotemeta $flags->{-char}; return '(?:^$char+$)'; }, match => sub { my ($self, $str) = @_; return $str !~ /[^$self->{flags}{-char}]/; }, subs => sub { my ($self, $str, $replacement) = @_; $_[1] =~ s/^$self->{flags}{-char}+$//g; }, ; my $asterisks = $RE{lineof}{-char=>'*'}; # DECIDING WHICH PATTERNS TO LOAD. use Regexp::Common qw /comment number/; # Comment and number patterns. use Regexp::Common qw /no_defaults/; # Don't load any patterns. use Regexp::Common qw /!delimited/; # All, but delimited patterns. =head1 DESCRIPTION By default, this module exports a single hash (C<%RE>) that stores or generates commonly needed regular expressions (see L<"List of available patterns">). There is an alternative, subroutine-based syntax described in L<"Subroutine-based interface">. =head2 General syntax for requesting patterns To access a particular pattern, C<%RE> is treated as a hierarchical hash of hashes (of hashes...), with each successive key being an identifier. For example, to access the pattern that matches real numbers, you specify: $RE{num}{real} and to access the pattern that matches integers: $RE{num}{int} Deeper layers of the hash are used to specify I: arguments that modify the resulting pattern in some way. The keys used to access these layers are prefixed with a minus sign and may have a value; if a value is given, it's done by using a multidimensional key. For example, to access the pattern that matches base-2 real numbers with embedded commas separating groups of three digits (e.g. 10,101,110.110101101): $RE{num}{real}{-base => 2}{-sep => ','}{-group => 3} Through the magic of Perl, these flag layers may be specified in any order (and even interspersed through the identifier keys!) so you could get the same pattern with: $RE{num}{real}{-sep => ','}{-group => 3}{-base => 2} or: $RE{num}{-base => 2}{real}{-group => 3}{-sep => ','} or even: $RE{-base => 2}{-group => 3}{-sep => ','}{num}{real} etc. Note, however, that the relative order of amongst the identifier keys I significant. That is: $RE{list}{set} would not be the same as: $RE{set}{list} =head2 Flag syntax In versions prior to 2.113, flags could also be written as C<{"-flag=value"}>. This no longer works, although C<{"-flag$;value"}> still does. However, C<< {-flag => 'value'} >> is the preferred syntax. =head2 Universal flags Normally, flags are specific to a single pattern. However, there is two flags that all patterns may specify. =over 4 =item C<-keep> By default, the patterns provided by C<%RE> contain no capturing parentheses. However, if the C<-keep> flag is specified (it requires no value) then any significant substrings that the pattern matches are captured. For example: if ($str =~ $RE{num}{real}{-keep}) { $number = $1; $whole = $3; $decimals = $5; } Special care is needed if a "kept" pattern is interpolated into a larger regular expression, as the presence of other capturing parentheses is likely to change the "number variables" into which significant substrings are saved. See also L<"Adding new regular expressions">, which describes how to create new patterns with "optional" capturing brackets that respond to C<-keep>. =item C<-i> Some patterns or subpatterns only match lowercase or uppercase letters. If one wants the do case insensitive matching, one option is to use the C regexp modifier, or the special sequence C<(?i)>. But if the functional interface is used, one does not have this option. The C<-i> switch solves this problem; by using it, the pattern will do case insensitive matching. =back =head2 OO interface and inline matching/substitution The patterns returned from C<%RE> are objects, so rather than writing: if ($str =~ /$RE{some}{pattern}/ ) {...} you can write: if ( $RE{some}{pattern}->matches($str) ) {...} For matching this would seem to have no great advantage apart from readability (but see below). For substitutions, it has other significant benefits. Frequently you want to perform a substitution on a string without changing the original. Most people use this: $changed = $original; $changed =~ s/$RE{some}{pattern}/$replacement/; The more adept use: ($changed = $original) =~ s/$RE{some}{pattern}/$replacement/; Regexp::Common allows you do write this: $changed = $RE{some}{pattern}->subs($original=>$replacement); Apart from reducing precedence-angst, this approach has the added advantages that the substitution behaviour can be optimized from the regular expression, and the replacement string can be provided by default (see L<"Adding new regular expressions">). For example, in the implementation of this substitution: $cropped = $RE{ws}{crop}->subs($uncropped); the default empty string is provided automatically, and the substitution is optimized to use: $uncropped =~ s/^\s+//; $uncropped =~ s/\s+$//; rather than: $uncropped =~ s/^\s+|\s+$//g; =head2 Subroutine-based interface The hash-based interface was chosen because it allows regexes to be effortlessly interpolated, and because it also allows them to be "curried". For example: my $num = $RE{num}{int}; my $commad = $num->{-sep=>','}{-group=>3}; my $duodecimal = $num->{-base=>12}; However, the use of tied hashes does make the access to Regexp::Common patterns slower than it might otherwise be. In contexts where impatience overrules laziness, Regexp::Common provides an additional subroutine-based interface. For each (sub-)entry in the C<%RE> hash (C<$RE{key1}{key2}{etc}>), there is a corresponding exportable subroutine: C. The name of each subroutine is the underscore-separated concatenation of the I keys that locate the same pattern in C<%RE>. Flags are passed to the subroutine in its argument list. Thus: use Regexp::Common qw( RE_ws_crop RE_num_real RE_profanity ); $str =~ RE_ws_crop() and die "Surrounded by whitespace"; $str =~ RE_num_real(-base=>8, -sep=>" ") or next; $offensive = RE_profanity(-keep); $str =~ s/$offensive/$bad{$1}++; ""/ge; Note that, unlike the hash-based interface (which returns objects), these subroutines return ordinary C'd regular expressions. Hence they do not curry, nor do they provide the OO match and substitution inlining described in the previous section. It is also possible to export subroutines for all available patterns like so: use Regexp::Common 'RE_ALL'; Or you can export all subroutines with a common prefix of keys like so: use Regexp::Common 'RE_num_ALL'; which will export C and C (and if you have create more patterns who have first key I, those will be exported as well). In general, I will export all subroutines whose pattern names have first keys I ... I. =head2 Adding new regular expressions You can add your own regular expressions to the C<%RE> hash at run-time, using the exportable C subroutine. It expects a hash-like list of key/value pairs that specify the behaviour of the pattern. The various possible argument pairs are: =over 4 =item C [ @list ]> A required argument that specifies the name of the pattern, and any flags it may take, via a reference to a list of strings. For example: pattern name => [qw( line of -char )], # other args here ; This specifies an entry C<$RE{line}{of}>, which may take a C<-char> flag. Flags may also be specified with a default value, which is then used whenever the flag is specified without an explicit value (but not when the flag is omitted). For example: pattern name => [qw( line of -char=_ )], # default char is '_' # other args here ; =item C $sub_ref_or_string> A required argument that specifies either a string that is to be returned as the pattern: pattern name => [qw( line of underscores )], create => q/(?:^_+$)/ ; or a reference to a subroutine that will be called to create the pattern: pattern name => [qw( line of -char=_ )], create => sub { my ($self, $flags) = @_; my $char = quotemeta $flags->{-char}; return '(?:^$char+$)'; }, ; If the subroutine version is used, the subroutine will be called with three arguments: a reference to the pattern object itself, a reference to a hash containing the flags and their values, and a reference to an array containing the non-flag keys. Whatever the subroutine returns is stringified as the pattern. No matter how the pattern is created, it is immediately postprocessed to include or exclude capturing parentheses (according to the value of the C<-keep> flag). To specify such "optional" capturing parentheses within the regular expression associated with C, use the notation C<(?k:...)>. Any parentheses of this type will be converted to C<(...)> when the C<-keep> flag is specified, or C<(?:...)> when it is not. It is a Regexp::Common convention that the outermost capturing parentheses always capture the entire pattern, but this is not enforced. =item C $sub_ref> An optional argument that specifies a subroutine that is to be called when the C<$RE{...}-Ematches(...)> method of this pattern is invoked. The subroutine should expect two arguments: a reference to the pattern object itself, and the string to be matched against. It should return the same types of values as a C does. pattern name => [qw( line of -char )], create => sub {...}, match => sub { my ($self, $str) = @_; $str !~ /[^$self->{flags}{-char}]/; }, ; =item C $sub_ref> An optional argument that specifies a subroutine that is to be called when the C<$RE{...}-Esubs(...)> method of this pattern is invoked. The subroutine should expect three arguments: a reference to the pattern object itself, the string to be changed, and the value to be substituted into it. The third argument may be C, indicating the default substitution is required. The subroutine should return the same types of values as an C does. For example: pattern name => [ 'lineof', '-char=_' ], create => sub {...}, subs => sub { my ($self, $str, $ignore_replacement) = @_; $_[1] =~ s/^$self->{flags}{-char}+$//g; }, ; Note that such a subroutine will almost always need to modify C<$_[1]> directly. =item C $minimum_perl_version> If this argument is given, it specifies the minimum version of perl required to use the new pattern. Attempts to use the pattern with earlier versions of perl will generate a fatal diagnostic. =back =head2 Loading specific sets of patterns. By default, all the sets of patterns listed below are made available. However, it is possible to indicate which sets of patterns should be made available - the wanted sets should be given as arguments to C. Alternatively, it is also possible to indicate which sets of patterns should not be made available - those sets will be given as argument to the C statement, but are preceded with an exclaimation mark. The argument I indicates none of the default patterns should be made available. This is useful for instance if all you want is the C subroutine. Examples: use Regexp::Common qw /comment number/; # Comment and number patterns. use Regexp::Common qw /no_defaults/; # Don't load any patterns. use Regexp::Common qw /!delimited/; # All, but delimited patterns. It's also possible to load your own set of patterns. If you have a module C that makes patterns available, you can have it made available with use Regexp::Common qw /my_patterns/; Note that the default patterns will still be made available - only if you use I, or mention one of the default sets explicitly, the non mentioned defaults aren't made available. =head2 List of available patterns The patterns listed below are currently available. Each set of patterns has its own manual page describing the details. For each pattern set named I, the manual page I describes the details. Currently available are: =over 4 =item Regexp::Common::balanced Provides regexes for strings with balanced parenthesized delimiters. =item Regexp::Common::comment Provides regexes for comments of various languages (43 languages currently). =item Regexp::Common::delimited Provides regexes for delimited strings. =item Regexp::Common::lingua Provides regexes for palindromes. =item Regexp::Common::list Provides regexes for lists. =item Regexp::Common::net Provides regexes for IPv4 addresses and MAC addresses. =item Regexp::Common::number Provides regexes for numbers (integers and reals). =item Regexp::Common::profanity Provides regexes for profanity. =item Regexp::Common::whitespace Provides regexes for leading and trailing whitespace. =item Regexp::Common::zip Provides regexes for zip codes. =back =head2 Forthcoming patterns and features Future releases of the module will also provide patterns for the following: * email addresses * HTML/XML tags * more numerical matchers, * mail headers (including multiline ones), * more URLS * telephone numbers of various countries * currency (universal 3 letter format, Latin-1, currency names) * dates * binary formats (e.g. UUencoded, MIMEd) If you have other patterns or pattern generators that you think would be generally useful, please send them to the maintainer -- preferably as source code using the C subroutine. Submissions that include a set of tests will be especially welcome. =head1 DIAGNOSTICS =over 4 =item C The subroutine-based interface didn't recognize the requested subroutine. Often caused by a spelling mistake or an incompletely specified name. =item C Regexp::Common doesn't have a generator for the requested pattern. Often indicates a misspelt or missing parameter. =item C The requested pattern requires advanced regex features (e.g. recursion) that not available in your version of Perl. Time to upgrade. =item C<< pattern() requires argument: name => [ @list ] >> Every user-defined pattern specification must have a name. =item C<< pattern() requires argument: create => $sub_ref_or_string >> Every user-defined pattern specification must provide a pattern creation mechanism: either a pattern string or a reference to a subroutine that returns the pattern string. =item C The C<< $RE{num}{real}{-base=>'I'} >> pattern uses the characters [0-9A-Z] to represent the digits of various bases. Hence it only produces regular expressions for bases up to hexatricensimal. =item C The pattern has no default delimiter. You need to write: C<< $RE{delimited}{-delim=>I'} >> for some character I =back =head1 ACKNOWLEDGEMENTS Deepest thanks to the many people who have encouraged and contributed to this project, especially: Elijah, Jarkko, Tom, Nat, Ed, and Vivek. Further thanks go to: Alexandr Ciornii, Blair Zajac, Bob Stockdale, Charles Thomas, Chris Vertonghen, the CPAN Testers, David Hand, Fany, Geoffrey Leach, Hermann-Marcus Behrens, Jerome Quelin, Jim Cromie, Lars Wilke, Linda Julien, Mike Arms, Mike Castle, Mikko, Murat Uenalan, RafaE<235>l Garcia-Suarez, Ron Savage, Sam Vilain, Slaven Rezic, Smylers, Tim Maher, and all the others I've forgotten. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. There are some POD issues when installing this module using a pre-5.6.0 perl; some manual pages may not install, or may not install correctly using a perl that is that old. You might consider upgrading your perl. =head1 NOT A BUG =over 4 =item * The various patterns are not anchored. That is, a pattern like C<< $RE {num} {int} >> will match against "abc4def", because a substring of the subject matches. This is by design, and not a bug. If you want the pattern to be anchored, use something like: my $integer = $RE {num} {int}; $subj =~ /^$integer$/ and print "Matches!\n"; =back =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2011, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. Regexp-Common-2016020301/lib/Regexp/Common/_support.pm000644 000765 000024 00000004466 12654447171 023013 0ustar00abigailstaff000000 000000 package Regexp::Common::_support; BEGIN { # This makes sure 'use warnings' doesn't bomb out on 5.005_*; # warnings won't be enabled on those old versions though. if ($] < 5.006 && !exists $INC {"warnings.pm"}) { $INC {"warnings.pm"} = 1; no strict 'refs'; *{"warnings::unimport"} = sub {0}; } } use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; # # Returns true/false, depending whether the given the argument # satisfies the LUHN checksum. # See http://www.webopedia.com/TERM/L/Luhn_formula.html. # # Note that this function is intended to be called from regular # expression, so it should NOT use a regular expression in any way. # sub luhn { my $arg = shift; my $even = 0; my $sum = 0; while (length $arg) { my $num = chop $arg; return if $num lt '0' || $num gt '9'; if ($even && (($num *= 2) > 9)) {$num = 1 + ($num % 10)} $even = 1 - $even; $sum += $num; } !($sum % 10) } sub import { my $pack = shift; my $caller = caller; no strict 'refs'; *{$caller . "::" . $_} = \&{$pack . "::" . $_} for @_; } 1; __END__ =pod =head1 NAME Regexp::Common::support -- Support functions for Regexp::Common. =head1 SYNOPSIS use Regexp::Common::_support qw /luhn/; luhn ($number) # Returns true/false. =head1 DESCRIPTION This module contains some subroutines to be used by other C modules. It's not intended to be used directly. Subroutines from the module may disappear without any notice, or their meaning or interface may change without notice. =over 4 =item luhn This subroutine returns true if its argument passes the luhn checksum test. =back =head1 SEE ALSO L. =head1 AUTHOR Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/balanced.pm000644 000765 000024 00000012427 12654447171 022665 0ustar00abigailstaff000000 000000 package Regexp::Common::balanced; { use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my %closer = ( '{'=>'}', '('=>')', '['=>']', '<'=>'>' ); my %cache; sub nested { my ($start, $finish) = @_; return $cache {$start} {$finish} if exists $cache {$start} {$finish}; my @starts = map {s/\\(.)/$1/g; $_} grep {length} $start =~ /([^|\\]+|\\.)+/gs; my @finishes = map {s/\\(.)/$1/g; $_} grep {length} $finish =~ /([^|\\]+|\\.)+/gs; push @finishes => ($finishes [-1]) x (@starts - @finishes); my @re; local $" = "|"; foreach my $begin (@starts) { my $end = shift @finishes; my $qb = quotemeta $begin; my $qe = quotemeta $end; my $fb = quotemeta substr $begin => 0, 1; my $fe = quotemeta substr $end => 0, 1; my $tb = quotemeta substr $begin => 1; my $te = quotemeta substr $end => 1; my $add; if ($fb eq $fe) { push @re => qq /(?:$qb(?:(?>[^$fb]+)|$fb(?!$tb)(?!$te)|(?-1))*$qe)/; } else { my @clauses = "(?>[^$fb$fe]+)"; push @clauses => "$fb(?!$tb)" if length $tb; push @clauses => "$fe(?!$te)" if length $te; push @clauses => "(?-1)"; push @re => qq /(?:$qb(?:@clauses)*$qe)/; } } $cache {$start} {$finish} = qr /(@re)/; } pattern name => [qw /balanced -parens=() -begin= -end=/], create => sub { my $flag = $_[1]; unless (defined $flag -> {-begin} && length $flag -> {-begin} && defined $flag -> {-end} && length $flag -> {-end}) { my @open = grep {index ($flag->{-parens}, $_) >= 0} ('[','(','{','<'); my @close = map {$closer {$_}} @open; $flag -> {-begin} = join "|" => @open; $flag -> {-end} = join "|" => @close; } return nested @$flag {qw /-begin -end/}; }, version => 5.010, ; } 1; __END__ =pod =head1 NAME Regexp::Common::balanced -- provide regexes for strings with balanced parenthesized delimiters or arbitrary delimiters. =head1 SYNOPSIS use Regexp::Common qw /balanced/; while (<>) { /$RE{balanced}{-parens=>'()'}/ and print q{balanced parentheses\n}; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{balanced}{-parens}> Returns a pattern that matches a string that starts with the nominated opening parenthesis or bracket, contains characters and properly nested parenthesized subsequences, and ends in the matching parenthesis. More than one type of parenthesis can be specified: $RE{balanced}{-parens=>'(){}'} in which case all specified parenthesis types must be correctly balanced within the string. Since version 2013030901, C<< $1 >> will always be set (to the entire matched substring), regardless whether C<< {-keep} >> is used or not. =head2 C<< $RE{balanced}{-begin => "begin"}{-end => "end"} >> Returns a pattern that matches a string that is properly balanced using the I and I strings as start and end delimiters. Multiple sets of begin and end strings can be given by separating them by C<|>s (which can be escaped with a backslash). qr/$RE{balanced}{-begin => "do|if|case"}{-end => "done|fi|esac"}/ will match properly balanced strings that either start with I and end with I, start with I and end with I, or start with I and end with I. If I<-end> contains less cases than I<-begin>, the last case of I<-end> is repeated. If it contains more cases than I<-begin>, the extra cases are ignored. If either of I<-begin> or I<-end> isn't given, or is empty, I<< -begin => '(' >> and I<< -end => ')' >> are assumed. Since version 2013030901, C<< $1 >> will always be set (to the entire matched substring), regardless whether C<< {-keep} >> is used or not. =head2 Note Since version 2013030901 the pattern will make of the recursive construct C<< (?-1) >>, instead of using the problematic C<< (??{ }) >> construct. This fixes an problem that was introduced in the 5.17 development track. This also means the pattern is no longer available for Perls older than 5.010. =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2013, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/CC.pm000644 000765 000024 00000006046 12654447171 021421 0ustar00abigailstaff000000 000000 package Regexp::Common::CC; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::_support qw /luhn/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my @cards = ( # Name Prefix Length mod 10 [Mastercard => '5[1-5]', 16, 1], [Visa => '4', [13, 16], 1], [Amex => '3[47]', 15, 1], # Carte Blanche ['Diners Club' => '3(?:0[0-5]|[68])', 14, 1], [Discover => '6011', 16, 1], [enRoute => '2(?:014|149)', 15, 0], [JCB => [['3', 16, 1], ['2131|1800', 15, 1]]], ); foreach my $card (@cards) { my ($name, $prefix, $length, $mod) = @$card; # Skip the harder ones for now. next if ref $prefix || ref $length; next unless $mod; my $times = $length + $mod; pattern name => [CC => $name], version => 5.006, create => sub { use re 'eval'; qr <((?=($prefix))[0-9]{$length}) (?(?{Regexp::Common::_support::luhn $1})|(?!))>x } ; } 1; __END__ =pod =head1 NAME Regexp::Common::CC -- provide patterns for credit card numbers. =head1 SYNOPSIS use Regexp::Common qw /CC/; while (<>) { /^$RE{CC}{Mastercard}$/ and print "Mastercard card number\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. This module offers patterns for credit card numbers of several major credit card types. Currently, the supported cards are: I, I, I, and I. =head1 SEE ALSO L for a general description of how to use this interface. =over 4 =item L Credit Card Validation - Check Digits =item L Everything you ever wanted to know about CC's =item L Luhn formula =back =head1 AUTHORS Damian Conway S<(I)> and Abigail S<(I)>. =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. Send them in to S>. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/comment.pm000644 000765 000024 00000075527 12654447171 022610 0ustar00abigailstaff000000 000000 package Regexp::Common::comment; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my @generic = ( {languages => [qw /ABC Forth/], to_eol => ['\\\\']}, # This is for just a *single* backslash. {languages => [qw /Ada Alan Eiffel lua/], to_eol => ['--']}, {languages => [qw /Advisor/], to_eol => ['#|//']}, {languages => [qw /Advsys CQL Lisp LOGO M MUMPS REBOL Scheme SMITH zonefile/], to_eol => [';']}, {languages => ['Algol 60'], from_to => [[qw /comment ;/]]}, {languages => [qw {ALPACA B C C-- LPC PL/I}], from_to => [[qw {/* */}]]}, {languages => [qw /awk fvwm2 Icon m4 mutt Perl Python QML R Ruby shell Tcl/], to_eol => ['#']}, {languages => [[BASIC => 'mvEnterprise']], to_eol => ['[*!]|REM']}, {languages => [qw /Befunge-98 Funge-98 Shelta/], id => [';']}, {languages => ['beta-Juliet', 'Crystal Report', 'Portia', 'Ubercode'], to_eol => ['//']}, {languages => ['BML'], from_to => [['']], }, {languages => [qw /C++/, 'C#', qw /Cg ECMAScript FPL Java JavaScript/], to_eol => ['//'], from_to => [[qw {/* */}]]}, {languages => [qw /CLU LaTeX slrn TeX/], to_eol => ['%']}, {languages => [qw /False/], from_to => [[qw !{ }!]]}, {languages => [qw /Fortran/], to_eol => ['!']}, {languages => [qw /Haifu/], id => [',']}, {languages => [qw /ILLGOL/], to_eol => ['NB']}, {languages => [qw /INTERCAL/], to_eol => [q{(?:(?:PLEASE(?:\s+DO)?|DO)\s+)?(?:NOT|N'T)}]}, {languages => [qw /J/], to_eol => ['NB[.]']}, {languages => [qw /JavaDoc/], from_to => [[qw {/** */}]]}, {languages => [qw /Nickle/], to_eol => ['#'], from_to => [[qw {/* */}]]}, {languages => [qw /Oberon/], from_to => [[qw /(* *)/]]}, {languages => [[qw /Pascal Delphi/], [qw /Pascal Free/], [qw /Pascal GPC/]], to_eol => ['//'], from_to => [[qw !{ }!], [qw !(* *)!]]}, {languages => [[qw /Pascal Workshop/]], id => [qw /"/], from_to => [[qw !{ }!], [qw !(* *)!], [qw !/* */!]]}, {languages => [qw /PEARL/], to_eol => ['!'], from_to => [[qw {/* */}]]}, {languages => [qw /PHP/], to_eol => ['#', '//'], from_to => [[qw {/* */}]]}, {languages => [qw !PL/B!], to_eol => ['[.;]']}, {languages => [qw !PL/SQL!], to_eol => ['--'], from_to => [[qw {/* */}]]}, {languages => [qw /Q-BAL/], to_eol => ['`']}, {languages => [qw /Smalltalk/], id => ['"']}, {languages => [qw /SQL/], to_eol => ['-{2,}']}, {languages => [qw /troff/], to_eol => ['\\\"']}, {languages => [qw /vi/], to_eol => ['"']}, {languages => [qw /*W/], from_to => [[qw {|| !!}]]}, {languages => [qw /ZZT-OOP/], to_eol => ["'"]}, ); my @plain_or_nested = ( [Caml => undef, "(*" => "*)"], [Dylan => "//", "/*" => "*/"], [Haskell => "-{2,}", "{-" => "-}"], [Hugo => "!(?!\\\\)", "!\\" => "\\!"], [SLIDE => "#", "(*" => "*)"], ['Modula-2' => undef, "(*" => "*)"], ['Modula-3' => undef, "(*" => "*)"], ); # # Helper subs. # sub combine { local $_ = join "|", @_; if (@_ > 1) { s/\(\?k:/(?:/g; $_ = "(?k:$_)"; } $_ } sub to_eol ($) {"(?k:(?k:$_[0])(?k:[^\\n]*)(?k:\\n))"} sub id ($) {"(?k:(?k:$_[0])(?k:[^$_[0]]*)(?k:$_[0]))"} # One char only! sub from_to { my ($begin, $end) = @_; my $qb = quotemeta $begin; my $qe = quotemeta $end; my $fe = quotemeta substr $end => 0, 1; my $te = quotemeta substr $end => 1; "(?k:(?k:$qb)(?k:(?:[^$fe]+|$fe(?!$te))*)(?k:$qe))"; } my $count = 0; sub nested { my ($begin, $end) = @_; $count ++; my $r = '(??{$Regexp::Common::comment ['. $count . ']})'; my $qb = quotemeta $begin; my $qe = quotemeta $end; my $fb = quotemeta substr $begin => 0, 1; my $fe = quotemeta substr $end => 0, 1; my $tb = quotemeta substr $begin => 1; my $te = quotemeta substr $end => 1; use re 'eval'; my $re; if ($fb eq $fe) { $re = qr /(?:$qb(?:(?>[^$fb]+)|$fb(?!$tb)(?!$te)|$r)*$qe)/; } else { local $" = "|"; my @clauses = "(?>[^$fb$fe]+)"; push @clauses => "$fb(?!$tb)" if length $tb; push @clauses => "$fe(?!$te)" if length $te; push @clauses => $r; $re = qr /(?:$qb(?:@clauses)*$qe)/; } $Regexp::Common::comment [$count] = qr/$re/; } # # Process data. # foreach my $info (@plain_or_nested) { my ($language, $mark, $begin, $end) = @$info; pattern name => [comment => $language], create => sub {my $re = nested $begin => $end; my $prefix = defined $mark ? $mark . "[^\n]*\n|" : ""; exists $_ [1] -> {-keep} ? qr /($prefix$re)/ : qr /$prefix$re/ }, version => 5.006, ; } foreach my $group (@generic) { my $pattern = combine +(map {to_eol $_} @{$group -> {to_eol}}), (map {from_to @$_} @{$group -> {from_to}}), (map {id $_} @{$group -> {id}}), ; foreach my $language (@{$group -> {languages}}) { pattern name => [comment => ref $language ? @$language : $language], create => $pattern, ; } } # # Other languages. # # http://www.pascal-central.com/docs/iso10206.txt pattern name => [qw /comment Pascal/], create => '(?k:' . '(?k:[{]|[(][*])' . '(?k:[^}*]*(?:[*](?![)])[^}*]*)*)' . '(?k:[}]|[*][)])' . ')' ; # http://www.templetons.com/brad/alice/language/ pattern name => [qw /comment Pascal Alice/], create => '(?k:(?k:[{])(?k:[^}\n]*)(?k:[}]))' ; # http://westein.arb-phys.uni-dortmund.de/~wb/a68s.txt pattern name => [qw (comment), 'Algol 68'], create => q {(?k:(?:#[^#]*#)|} . q {(?:\bco\b(?:[^c]+|\Bc|\bc(?!o\b))*\bco\b)|} . q {(?:\bcomment\b(?:[^c]+|\Bc|\bc(?!omment\b))*\bcomment\b))} ; # See rules 91 and 92 of ISO 8879 (SGML). # Charles F. Goldfarb: "The SGML Handbook". # Oxford: Oxford University Press. 1990. ISBN 0-19-853737-9. # Ch. 10.3, pp 390. pattern name => [qw (comment HTML)], create => q {(?k:(?k:))}, ; pattern name => [qw /comment SQL MySQL/], create => q {(?k:(?:#|-- )[^\n]*\n|} . q {/\*(?:(?>[^*;"']+)|"[^"]*"|'[^']*'|\*(?!/))*(?:;|\*/))}, ; # Anything that isn't <>[]+-., # http://home.wxs.nl/~faase009/Ha_BF.html pattern name => [qw /comment Brainfuck/], create => '(?k:[^<>\[\]+\-.,]+)' ; # Squeak is a variant of Smalltalk-80. # http://www.squeak. # http://mucow.com/squeak-qref.html pattern name => [qw /comment Squeak/], create => '(?k:(?k:")(?k:[^"]*(?:""[^"]*)*)(?k:"))' ; # # Scores of less than 5 or above 17.... # http://www.cliff.biffle.org/esoterica/beatnik.html @Regexp::Common::comment::scores = (1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10); { my ($s, $x); pattern name => [qw /comment Beatnik/], create => sub { use re 'eval'; my $re = qr {\b([A-Za-z]+)\b (?(?{($s, $x) = (0, lc $^N); $s += $Regexp::Common::comment::scores [ord (chop $x) - ord ('a')] while length $x; $s >= 5 && $s < 18})XXX|)}x; $re; }, version => 5.008, ; } # http://www.cray.com/craydoc/manuals/007-3692-005/html-007-3692-005/ # (Goto table of contents/3.3 Source Form) # Fortran, in fixed format. Comments start with a C, c or * in the first # column, or a ! anywhere, but the sixth column. Then end with a newline. pattern name => [qw /comment Fortran fixed/], create => '(?k:(?k:(?:^[Cc*]|(? [qw /comment COBOL/], create => '(?<=^......)(?k:(?k:[*])(?k:[^\n]*)(?k:\n))', version => '5.008', ; 1; __END__ =pod =head1 NAME Regexp::Common::comment -- provide regexes for comments. =head1 SYNOPSIS use Regexp::Common qw /comment/; while (<>) { /$RE{comment}{C}/ and print "Contains a C comment\n"; /$RE{comment}{C++}/ and print "Contains a C++ comment\n"; /$RE{comment}{PHP}/ and print "Contains a PHP comment\n"; /$RE{comment}{Java}/ and print "Contains a Java comment\n"; /$RE{comment}{Perl}/ and print "Contains a Perl comment\n"; /$RE{comment}{awk}/ and print "Contains an awk comment\n"; /$RE{comment}{HTML}/ and print "Contains an HTML comment\n"; } use Regexp::Common qw /comment RE_comment_HTML/; while (<>) { $_ =~ RE_comment_HTML() and print "Contains an HTML comment\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. This modules gives you regular expressions for comments in various languages. =head2 THE LANGUAGES Below, the comments of each of the languages are described. The patterns are available as C<$RE{comment}{I}>, foreach language I. Some languages have variants; it's described at the individual languages how to get the patterns for the variants. Unless mentioned otherwise, C<{-keep}> sets C<$1>, C<$2>, C<$3> and C<$4> to the entire comment, the opening marker, the content of the comment, and the closing marker (for many languages, the latter is a newline) respectively. =over 4 =item ABC Comments in I start with a backslash (C<\>), and last till the end of the line. See L. =item Ada Comments in I start with C<-->, and last till the end of the line. =item Advisor I is a language used by the HP product I. Comments for this language start with either C<#> or C, and last till the end of the line. =item Advsys Comments for the I language start with C<;> and last till the end of the line. See also L. =item Alan I comments start with C<-->, and last till the end of the line. See also L. =item Algol 60 Comments in the I language start with the keyword C, and end with a C<;>. See L. =item Algol 68 In I, comments are either delimited by C<#>, or by one of the keywords C or C. The keywords should not be part of another word. See L. With C<{-keep}>, only C<$1> will be set, returning the entire comment. =item ALPACA The I language has comments starting with C and ending with C<*/>. =item awk The I programming language uses comments that start with C<#> and end at the end of the line. =item B The I language has comments starting with C and ending with C<*/>. =item BASIC There are various forms of BASIC around. Currently, we only support the variant supported by I, whose pattern is available as C<$RE{comment}{BASIC}{mvEnterprise}>. Comments in this language start with a C, a C<*> or the keyword C, and end till the end of the line. See L. =item Beatnik The esotoric language I only uses words consisting of letters. Words are scored according to the rules of Scrabble. Words scoring less than 5 points, or 18 points or more are considered comments (although the compiler might mock at you if you score less than 5 points). Regardless whether C<{-keep}>, C<$1> will be set, and set to the entire comment. This pattern requires I or newer. =item beta-Juliet The I programming language has comments that start with C and that continue till the end of the line. See also L. =item Befunge-98 The esotoric language I uses comments that start and end with a C<;>. See L. =item BML I, or I is an HTML templating language that uses comments starting with C<< >, and ending with C<< c_?> >>. See L. =item Brainfuck The minimal language I uses only eight characters, C>, C>, C<[>, C<]>, C<+>, C<->, C<.> and C<,>. Any other characters are considered comments. With C<{-keep}>, C<$1> is set to the entire comment. =item C The I language has comments starting with C and ending with C<*/>. =item C-- The I language has comments starting with C and ending with C<*/>. See L. =item C++ The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. =item C# The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. See L. =item Caml Comments in I start with C<(*>, end with C<*)>, and can be nested. See L and L. =item Cg The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. See L. =item CLU In C, a comment starts with a procent sign (C<%>), and ends with the next newline. See L and L. =item COBOL Traditionally, comments in I are indicated by an asteriks in the seventh column. This is what the pattern matches. Modern compiler may more lenient though. See L, and L. Due to a bug in the regexp engine of perl 5.6.x, this regexp is only available in version 5.8.0 and up. =item CQL Comments in the chess query language (I) start with a semi colon (C<;>) and last till the end of the line. See L. =item Crystal Report The formula editor in I uses comments that start with C, and end with the end of the line. =item Dylan There are two types of comments in I. They either start with C, or are nested comments, delimited with C and C<*/>. Under C<{-keep}>, only C<$1> will be set, returning the entire comment. This pattern requires I or newer. =item ECMAScript The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. I is Netscapes implementation of I. See L, and L. =item Eiffel I comments start with C<-->, and last till the end of the line. =item False In I, comments start with C<{> and end with C<}>. See L =item FPL The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. =item Forth Comments in Forth start with C<\>, and end with the end of the line. See also L. =item Fortran There are two forms of I. There's free form I, which has comments that start with C, and end at the end of the line. The pattern for this is given by C<$RE{Fortran}>. Fixed form I, which has been obsoleted, has comments that start with C, C or C<*> in the first column, or with C anywhere, but the sixth column. The pattern for this are given by C<$RE{Fortran}{fixed}>. See also L. =item Funge-98 The esotoric language I uses comments that start and end with a C<;>. =item fvwm2 Configuration files for I have comments starting with a C<#> and lasting the rest of the line. =item Haifu I, an esotoric language using haikus, has comments starting and ending with a C<,>. See L. =item Haskell There are two types of comments in I. They either start with at least two dashes, or are nested comments, delimited with C<{-> and C<-}>. Under C<{-keep}>, only C<$1> will be set, returning the entire comment. This pattern requires I or newer. =item HTML In I, comments only appear inside a I. A comment declaration starts with a C!>, and ends with a C>. Inside this declaration, we have zero or more comments. Comments starts with C<--> and end with C<-->, and are optionally followed by whitespace. The pattern C<$RE{comment}{HTML}> recognizes those comment declarations (and hence more than a comment). Note that this is not the same as something that starts with C!--> and ends with C<--E>, because the following will be matched completely: Second Comment Do not be fooled by what your favourite browser thinks is an HTML comment. If C<{-keep}> is used, the following are returned: =over 4 =item $1 captures the entire comment declaration. =item $2 captures the MDO (markup declaration open), C!>. =item $3 captures the content between the MDO and the MDC. =item $4 captures the (last) comment, without the surrounding dashes. =item $5 captures the MDC (markup declaration close), C>. =back =item Hugo There are two types of comments in I. They either start with C (which cannot be followed by a C<\>), or are nested comments, delimited with C and C<\!>. Under C<{-keep}>, only C<$1> will be set, returning the entire comment. This pattern requires I or newer. =item Icon I has comments that start with C<#> and end at the next new line. See L, L, and L. =item ILLGOL The esotoric language I uses comments starting with I and lasting till the end of the line. See L. =item INTERCAL Comments in INTERCAL are single line comments. They start with one of the keywords C or C, and can optionally be preceded by the keywords C and C. If both keywords are used, C precedes C. Keywords are separated by whitespace. =item J The language I uses comments that start with C, and that last till the end of the line. See L, and L. =item Java The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. =item JavaDoc The I documentation syntax is demarked with a subset of ordinary Java comments to separate it from code. Comments start with C end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. See L. =item JavaScript The I language has two forms of comments. Comments that start with C and last till the end of the line, and comments that start with C, and end with C<*/>. If C<{-keep}> is used, only C<$1> will be set, and set to the entire comment. I is Netscapes implementation of I. See L, and L. =item LaTeX The documentation language I uses comments starting with C<%> and ending at the end of the line. =item Lisp Comments in I start with a semi-colon (C<;>) and last till the end of the line. =item LPC The I language has comments starting with C and ending with C<*/>. =item LOGO Comments for the language I start with C<;>, and last till the end of the line. =item lua Comments for the I language start with C<-->, and last till the end of the line. See also L. =item M, MUMPS In C (aka C), comments start with a semi-colon, and last till the end of a line. The language specification requires the semi-colon to be preceded by one or more Is. Those characters default to a space, but that's configurable. This requirement, of preceding the comment with linestart characters is B tested for. See L, L, and L. =item m4 By default, the preprocessor language I uses single line comments, that start with a C<#> and continue to the end of the line, including the newline. The pattern C<$RE {comment} {m4}> matches such comments. In I, it is possible to change the starting token though. See L, L, and L. =item Modula-2 In C, comments start with C<(*>, and end with C<*)>. Comments may be nested. See L. =item Modula-3 In C, comments start with C<(*>, and end with C<*)>. Comments may be nested. See L. =item mutt Configuration files for I have comments starting with a C<#> and lasting the rest of the line. =item Nickle The I language has one line comments starting with C<#> (like Perl), or multiline comments delimited by C and C<*/> (like C). Under C<-keep>, only C<$1> will be set. See also L. =item Oberon Comments in I start with C<(*> and end with C<*)>. See L. =item Pascal There are many implementations of Pascal. This modules provides pattern for comments of several implementations. =over 4 =item C<$RE{comment}{Pascal}> This is the pattern that recognizes comments according to the Pascal ISO standard. This standard says that comments start with either C<{>, or C<(*>, and end with C<}> or C<*)>. This means that C<{*)> and C<(*}> are considered to be comments. Many Pascal applications don't allow this. See L =item C<$RE{comment}{Pascal}{Alice}> The I compiler accepts comments that start with C<{> and end with C<}>. Comments are not allowed to contain newlines. See L. =item C<$RE{comment}{Pascal}{Delphi}>, C<$RE{comment}{Pascal}{Free}> and C<$RE{comment}{Pascal}{GPC}> The I, I and the I implementations of Pascal all have comments that either start with C and last till the end of the line, are delimited with C<{> and C<}> or are delimited with C<(*> and C<*)>. Patterns for those comments are given by C<$RE{comment}{Pascal}{Delphi}>, C<$RE{comment}{Pascal}{Free}> and C<$RE{comment}{Pascal}{GPC}> respectively. These patterns only set C<$1> when C<{-keep}> is used, which will then include the entire comment. See L, L and L. =item C<$RE{comment}{Pascal}{Workshop}> The I compiler, from SUN Microsystems, allows comments that are delimited with either C<{> and C<}>, delimited with C<(*)> and C<*>), delimited with C, and C<*/>, or starting and ending with a double quote (C<">). When C<{-keep}> is used, only C<$1> is set, and returns the entire comment. See L. =back =item PEARL Comments in I start with a C and last till the end of the line, or start with C and end with C<*/>. With C<{-keep}>, C<$1> will be set to the entire comment. =item PHP Comments in I start with either C<#> or C and last till the end of the line, or are delimited by C and C<*/>. With C<{-keep}>, C<$1> will be set to the entire comment. =item PL/B In I, comments start with either C<.> or C<;>, and end with the next newline. See L. =item PL/I The I language has comments starting with C and ending with C<*/>. =item PL/SQL In I, comments either start with C<--> and run till the end of the line, or start with C and end with C<*/>. =item Perl I uses comments that start with a C<#>, and continue till the end of the line. =item Portia The I programming language has comments that start with C, and last till the end of the line. =item Python I uses comments that start with a C<#>, and continue till the end of the line. =item Q-BAL Comments in the I language start with C<`> (a backtick), and contine till the end of the line. =item QML In C, comments start with C<#> and last till the end of the line. See L. =item R The statistical language I uses comments that start with a C<#> and end with the following new line. See L. =item REBOL Comments for the I language start with C<;> and last till the end of the line. =item Ruby Comments in I start with C<#> and last till the end of the time. =item Scheme I comments start with C<;>, and last till the end of the line. See L. =item shell Comments in various Is start with a C<#> and end at the end of the line. =item Shelta The esotoric language I uses comments that start and end with a C<;>. See L. =item SLIDE The I language has two froms of comments. First there is the line comment, which starts with a C<#> and includes the rest of the line (just like Perl). Second, there is the multiline, nested comment, which are delimited by C<(*> and C<*)>. Under C{-keep}>, only C<$1> is set, and is set to the entire comment. This pattern needs at least Perl version 5.6.0. See L. =item slrn Configuration files for I have comments starting with a C<%> and lasting the rest of the line. =item Smalltalk I uses comments that start and end with a double quote, C<">. =item SMITH Comments in the I language start with C<;>, and last till the end of the line. =item Squeak In the Smalltalk variant I, comments start and end with C<">. Double quotes can appear inside comments by doubling them. =item SQL Standard I uses comments starting with two or more dashes, and ending at the end of the line. I does not follow the standard. Instead, it allows comments that start with a C<#> or C<-- > (that's two dashes and a space) ending with the following newline, and comments starting with C, and ending with the next C<;> or C<*/> that isn't inside single or double quotes. A pattern for this is returned by C<$RE{comment}{SQL}{MySQL}>. With C<{-keep}>, only C<$1> will be set, and it returns the entire comment. =item Tcl In I, comments start with C<#> and continue till the end of the line. =item TeX The documentation language I uses comments starting with C<%> and ending at the end of the line. =item troff The document formatting language I uses comments starting with C<\">, and continuing till the end of the line. =item Ubercode The Windows programming language I uses comments that start with C and continue to the end of the line. See L. =item vi In configuration files for the editor I, one can use comments starting with C<">, and ending at the end of the line. =item *W In the language I<*W>, comments start with C<||>, and end with C. =item zonefile Comments in DNS Is start with C<;>, and continue till the end of the line. =item ZZT-OOP The in-game language I uses comments that start with a C<'> character, and end at the following newline. See L. =back =head1 REFERENCES =over 4 =item B<[Go 90]> Charles F. Goldfarb: I. Oxford: Oxford University Press. B<1990>. ISBN 0-19-853737-9. Ch. 10.3, pp 390-391. =back =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/delimited.pm000644 000765 000024 00000010105 12654447171 023063 0ustar00abigailstaff000000 000000 package Regexp::Common::delimited; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; sub gen_delimited { my ($dels, $escs) = @_; # return '(?:\S*)' unless $dels =~ /\S/; if (length $escs) { $escs .= substr ($escs, -1) x (length ($dels) - length ($escs)); } my @pat = (); my $i; for ($i=0; $i < length $dels; $i++) { my $del = quotemeta substr ($dels, $i, 1); my $esc = length($escs) ? quotemeta substr ($escs, $i, 1) : ""; if ($del eq $esc) { push @pat, "(?k:$del)(?k:[^$del]*(?:(?:$del$del)[^$del]*)*)(?k:$del)"; } elsif (length $esc) { push @pat, "(?k:$del)(?k:[^$esc$del]*(?:$esc.[^$esc$del]*)*)(?k:$del)"; } else { push @pat, "(?k:$del)(?k:[^$del]*)(?k:$del)"; } } my $pat = join '|', @pat; return "(?k:$pat)"; } sub _croak { require Carp; goto &Carp::croak; } pattern name => [qw( delimited -delim= -esc=\\ )], create => sub {my $flags = $_[1]; _croak 'Must specify delimiter in $RE{delimited}' unless length $flags->{-delim}; return gen_delimited (@{$flags}{-delim, -esc}); }, ; pattern name => [qw( quoted -esc=\\ )], create => sub {my $flags = $_[1]; return gen_delimited (q{"'`}, $flags -> {-esc}); }, ; 1; __END__ =pod =head1 NAME Regexp::Common::delimited -- provides a regex for delimited strings =head1 SYNOPSIS use Regexp::Common qw /delimited/; while (<>) { /$RE{delimited}{-delim=>'"'}/ and print 'a \" delimited string'; /$RE{delimited}{-delim=>'/'}/ and print 'a \/ delimited string'; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{delimited}{-delim}{-esc}> Returns a pattern that matches a single-character-delimited substring, with optional internal escaping of the delimiter. When C<-delim=I> is specified, each character in the sequence I is a possible delimiter. There is no default delimiter, so this flag must always be specified. If C<-esc=I> is specified, each character in the sequence I is the delimiter for the corresponding character in the C<-delim=I> list. The default escape is backslash. For example: $RE{delimited}{-delim=>'"'} # match "a \" delimited string" $RE{delimited}{-delim=>'"'}{-esc=>'"'} # match "a "" delimited string" $RE{delimited}{-delim=>'/'} # match /a \/ delimited string/ $RE{delimited}{-delim=>q{'"}} # match "string" or 'string' Under C<-keep> (See L): =over 4 =item $1 captures the entire match =item $2 captures the opening delimiter (provided only one delimiter was specified) =item $3 captures delimited portion of the string (provided only one delimiter was specified) =item $4 captures the closing delimiter (provided only one delimiter was specified) =back =head2 $RE{quoted}{-esc} A synonym for C<$RE{delimited}{q{-delim='"`}{...}}> =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/lingua.pm000644 000765 000024 00000004627 12654447171 022416 0ustar00abigailstaff000000 000000 package Regexp::Common::lingua; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; pattern name => [qw /lingua palindrome -chars=[A-Za-z]/], create => sub { use re 'eval'; my $keep = exists $_ [1] -> {-keep}; my $ch = $_ [1] -> {-chars}; my $idx = $keep ? "1:$ch" : "0:$ch"; my $r = "(??{\$Regexp::Common::lingua::pd{'" . $idx . "'}})"; $Regexp::Common::lingua::pd {$idx} = $keep ? qr /($ch|($ch)($r)?\2)/ : qr /$ch|($ch)($r)?\1/; # print "[$ch]: ", $Regexp::Common::lingua::pd {$idx}, "\n"; # $Regexp::Common::lingua::pd {$idx}; }, version => 5.006 ; 1; __END__ =pod =head1 NAME Regexp::Common::lingua -- provide regexes for language related stuff. =head1 SYNOPSIS use Regexp::Common qw /lingua/; while (<>) { /^$RE{lingua}{palindrome}$/ and print "is a palindrome\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{lingua}{palindrome}> Returns a pattern that recognizes a palindrome, a string that is the same if you reverse it. By default, it only matches strings consisting of letters, but this can be changed using the C<{-chars}> option. This option takes a character class (default is C<[A-Za-z]>) as argument. If C<{-keep}> is used, only C<$1> will be set, and set to the entire match. This pattern requires at least perl 5.6.0. =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Many regexes are missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/list.pm000644 000765 000024 00000010216 12654447171 022101 0ustar00abigailstaff000000 000000 package Regexp::Common::list; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; sub gen_list_pattern { my ($pat, $sep, $lsep) = @_; $lsep = $sep unless defined $lsep; return "(?k:(?:(?:$pat)(?:$sep))*(?:$pat)(?k:$lsep)(?:$pat))"; } my $defpat = '.*?\S'; my $defsep = '\s*,\s*'; pattern name => ['list', "-pat=$defpat", "-sep=$defsep", '-lastsep'], create => sub {gen_list_pattern (@{$_[1]}{-pat, -sep, -lastsep})}, ; pattern name => ['list', 'conj', '-word=(?:and|or)'], create => sub {gen_list_pattern($defpat, $defsep, '\s*,?\s*'.$_[1]->{-word}.'\s*'); }, ; pattern name => ['list', 'and'], create => sub {gen_list_pattern ($defpat, $defsep, '\s*,?\s*and\s*')}, ; pattern name => ['list', 'or'], create => sub {gen_list_pattern ($defpat, $defsep, '\s*,?\s*or\s*')}, ; 1; __END__ =pod =head1 NAME Regexp::Common::list -- provide regexes for lists =head1 SYNOPSIS use Regexp::Common qw /list/; while (<>) { /$RE{list}{-pat => '\w+'}/ and print "List of words"; /$RE{list}{-pat => $RE{num}{real}}/ and print "List of numbers"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{list}{-pat}{-sep}{-lastsep}> Returns a pattern matching a list of (at least two) substrings. If C<-pat=I

> is specified, it defines the pattern for each substring in the list. By default, I

is C. In Regexp::Common 0.02 or earlier, the default pattern was C. But that will match a single space, causing unintended parsing of C as a list of four elements instead of 3 (with C<-word> being C<(?:and)>). One consequence is that a list of the form "a,,b" will no longer be parsed. Use the pattern C to be able to parse this, but see the previous remark. If C<-sep=I

> is specified, it defines the pattern I

to be used as a separator between each pair of substrings in the list, except the final two. By default I

is C. If C<-lastsep=I

> is specified, it defines the pattern I

to be used as a separator between the final two substrings in the list. By default I

is the same as the pattern specified by the C<-sep> flag. For example: $RE{list}{-pat=>'\w+'} # match a list of word chars $RE{list}{-pat=>$RE{num}{real}} # match a list of numbers $RE{list}{-sep=>"\t"} # match a tab-separated list $RE{list}{-lastsep=>',\s+and\s+'} # match a proper English list Under C<-keep>: =over 4 =item $1 captures the entire list =item $2 captures the last separator =back =head2 C<$RE{list}{conj}{-word=I}> An alias for C<< $RE{list}{-lastsep=>'\s*,?\s*I\s*'} >> If C<-word> is not specified, the default pattern is C. For example: $RE{list}{conj}{-word=>'et'} # match Jean, Paul, et Satre $RE{list}{conj}{-word=>'oder'} # match Bonn, Koln oder Hamburg =head2 C<$RE{list}{and}> An alias for C<< $RE{list}{conj}{-word=>'and'} >> =head2 C<$RE{list}{or}> An alias for C<< $RE{list}{conj}{-word=>'or'} >> =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/net.pm000644 000765 000024 00000031064 12654447171 021720 0ustar00abigailstaff000000 000000 package Regexp::Common::net; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my %IPunit = ( dec => q{(?k:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})}, oct => q{(?k:[0-3]?[0-7]{1,2})}, hex => q{(?k:[0-9a-fA-F]{1,2})}, bin => q{(?k:[0-1]{1,8})}, ); my %MACunit = ( %IPunit, hex => q{(?k:[0-9a-fA-F]{1,2})}, ); my %IPv6unit = ( hex => q {(?k:[0-9a-f]{1,4})}, HEX => q {(?k:[0-9A-F]{1,4})}, HeX => q {(?k:[0-9a-fA-F]{1,4})}, ); sub dec {$_}; sub bin {oct "0b$_"} my $IPdefsep = '[.]'; my $MACdefsep = ':'; my $IPv6defsep = ':'; pattern name => [qw (net IPv4)], create => "(?k:$IPunit{dec}$IPdefsep$IPunit{dec}$IPdefsep" . "$IPunit{dec}$IPdefsep$IPunit{dec})", ; pattern name => [qw (net MAC)], create => "(?k:" . join ($MACdefsep => ($MACunit{hex}) x 6) . ")", subs => sub { $_ [1] = join ":" => map {sprintf "%02x" => hex} split /$MACdefsep/ => $_ [1] if $_ [1] =~ /$_[0]/ }, ; foreach my $type (qw /dec oct hex bin/) { pattern name => [qw (net IPv4), $type, "-sep=$IPdefsep"], create => sub {my $sep = $_ [1] -> {-sep}; "(?k:$IPunit{$type}$sep$IPunit{$type}$sep" . "$IPunit{$type}$sep$IPunit{$type})" }, ; pattern name => [qw (net MAC), $type, "-sep=$MACdefsep"], create => sub {my $sep = $_ [1] -> {-sep}; "(?k:" . join ($sep => ($MACunit{$type}) x 6) . ")", }, subs => sub { return if $] < 5.006 and $type eq 'bin'; $_ [1] = join ":" => map {sprintf "%02x" => eval $type} $2, $3, $4, $5, $6, $7 if $_ [1] =~ $RE {net} {MAC} {$type} {-sep => $_ [0] -> {flags} {-sep}} {-keep}; }, ; } my %cache6; pattern name => [qw (net IPv6), "-sep=$IPv6defsep", "-style=HeX"], create => sub { my $style = $_ [1] {-style}; my $sep = $_ [1] {-sep}; return $cache6 {$style, $sep} if $cache6 {$style, $sep}; my @re; die "Impossible style '$style'\n" unless exists $IPv6unit {$style}; # # Nothing missing # push @re => join $sep => ($IPv6unit {$style}) x 8; # # For "double colon" representations, at least 2 units must # be omitted, leaving us with at most 6 units. 0 units is also # possible. Note we can have at most one double colon. # for (my $l = 0; $l <= 6; $l ++) { # # We prefer to do longest match, so larger $r gets priority # for (my $r = 6 - $l; $r >= 0; $r --) { # # $l is the number of blocks left of the double colon, # $r is the number of blocks left of the double colon, # $m is the number of omitted blocks # my $m = 8 - $l - $r; my $patl = $l ? ($IPv6unit {$style} . $sep) x $l : $sep; my $patr = $r ? ($sep . $IPv6unit {$style}) x $r : $sep; my $patm = "(?k:)" x $m; my $pat = $patl . $patm . $patr; push @re => "(?:$pat)"; } } local $" = "|"; $cache6 {$style, $sep} = qq /(?k:(?|@re))/; }, version => 5.010 ; my $letter = "[A-Za-z]"; my $let_dig = "[A-Za-z0-9]"; my $let_dig_hyp = "[-A-Za-z0-9]"; # Domain names, from RFC 1035. pattern name => [qw (net domain -nospace= -rfc1101=)], create => sub { my $rfc1101 = exists $_ [1] {-rfc1101} && !defined $_ [1] {-rfc1101}; my $lead = $rfc1101 ? "(?!$RE{net}{IPv4}(?:[.]|\$))$let_dig" : $letter; if (exists $_ [1] {-nospace} && !defined $_ [1] {-nospace}) { return "(?k:$lead(?:(?:$let_dig_hyp){0,61}$let_dig)?" . "(?:\\.$lead(?:(?:$let_dig_hyp){0,61}$let_dig)?)*)" } else { return "(?k: |(?:$lead(?:(?:$let_dig_hyp){0,61}$let_dig)?" . "(?:\\.$lead(?:(?:$let_dig_hyp){0,61}$let_dig)?)*))" } }, ; 1; __END__ =head1 NAME Regexp::Common::net -- provide regexes for IPv4 addresses. =head1 SYNOPSIS use Regexp::Common qw /net/; while (<>) { /$RE{net}{IPv4}/ and print "Dotted decimal IP address"; /$RE{net}{IPv4}{hex}/ and print "Dotted hexadecimal IP address"; /$RE{net}{IPv4}{oct}{-sep => ':'}/ and print "Colon separated octal IP address"; /$RE{net}{IPv4}{bin}/ and print "Dotted binary IP address"; /$RE{net}{MAC}/ and print "MAC address"; /$RE{net}{MAC}{oct}{-sep => " "}/ and print "Space separated octal MAC address"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. This modules gives you regular expressions for various style IPv4 and MAC (or ethernet) addresses. =head2 C<$RE{net}{IPv4}> Returns a pattern that matches a valid IP address in "dotted decimal". Note that while C<318.99.183.11> is not a valid IP address, it does match C, but this is because C<318.99.183.11> contains a valid IP address, namely C<18.99.183.11>. To prevent the unwanted matching, one needs to anchor the regexp: C. For this pattern and the next four, under C<-keep> (See L): =over 4 =item $1 captures the entire match =item $2 captures the first component of the address =item $3 captures the second component of the address =item $4 captures the third component of the address =item $5 captures the final component of the address =back =head2 C<$RE{net}{IPv4}{dec}{-sep}> Returns a pattern that matches a valid IP address in "dotted decimal" If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

>> is specified the pattern I

is used as the separator. By default I

is C. C<< -sep="" >> and C<< -sep=" " >> are useful alternatives. =head2 C<$RE{net}{IPv4}{oct}{-sep}> Returns a pattern that matches a valid IP address in "dotted octal" If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

is C. =head2 C<$RE{net}{IPv4}{bin}{-sep}> Returns a pattern that matches a valid IP address in "dotted binary" If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

is C. =head2 C<$RE{net}{MAC}> Returns a pattern that matches a valid MAC or ethernet address as colon separated hexadecimals. For this pattern, and the next four, under C<-keep> (See L): =over 4 =item $1 captures the entire match =item $2 captures the first component of the address =item $3 captures the second component of the address =item $4 captures the third component of the address =item $5 captures the fourth component of the address =item $6 captures the fifth component of the address =item $7 captures the sixth and final component of the address =back This pattern, and the next four, have a C method as well, which will transform a matching MAC address into so called canonical format. Canonical format means that every component of the address will be exactly two hexadecimals (with a leading zero if necessary), and the components will be separated by a colon. The C method will not work for binary MAC addresses if the Perl version predates 5.6.0. =head2 C<$RE{net}{MAC}{dec}{-sep}> Returns a pattern that matches a valid MAC address as colon separated decimals. If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

is C. =head2 C<$RE{net}{MAC}{hex}{-sep}> Returns a pattern that matches a valid MAC address as colon separated hexadecimals, with the letters C to C in lower case. If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

is C. =head2 C<$RE{net}{MAC}{oct}{-sep}> Returns a pattern that matches a valid MAC address as colon separated octals. If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

is C. =head2 C<$RE{net}{MAC}{bin}{-sep}> Returns a pattern that matches a valid MAC address as colon separated binary numbers. If C<< -sep=I

>> is specified the pattern I

is used as the separator. By default I

is C. =head2 C<< $RE{net}{IPv6}{-sep => ':'}{-style => 'HeX'} >> Returns a pattern matching IPv6 numbers. An IPv6 address consists of eigth groups of four hexadecimal digits, separated by colons. In each group, leading zeros may be omitted. Two or more consecutive groups consisting of only zeros may be omitted (including any colons separating them), resulting into two sets of groups, separated by a double colon. (Each of the groups may be empty; C<< :: >> is a valid address, equal to C<< 0000:0000:0000:0000:0000:0000:0000:0000 >>). The hex numbers may be in either case. If the C<< -sep >> option is used, its argument is a pattern that matches the separator that separates groups. This defaults to C<< : >>. The C<< -style >> option is used to denote which case the hex numbers may be. The default style, C<< 'HeX' >> indicates both lower case letters C<< 'a' >> to C<< 'f' >> and upper case letters C<< 'A' >> to C<< 'F' >> will be matched. The style C<< 'HEX' >> restricts matching to upper case letters, and C<< 'hex' >> only matches lower case letters. If C<< {-keep} >> is used, C<< $1 >> to C<< $9 >> will be set. C<< $1 >> will be set to the matched address, while C<< $2 >> to C<< $9 >> will be set to each matched group. If a group is omitted because it contains all zeros, its matching variable will be the empty string. Example: "2001:db8:85a3::8a2e:370:7334" =~ /$RE{net}{IPv6}{-keep}/; print $2; # '2001' print $4; # '85a3' print $6; # Empty string print $8; # '370' Perl 5.10 (or later) is required for this pattern. =head2 C<$RE{net}{domain}> Returns a pattern to match domains (and hosts) as defined in RFC 1035. Under I{-keep} only the entire domain name is returned. RFC 1035 says that a single space can be a domainname too. So, the pattern returned by C<$RE{net}{domain}> recognizes a single space as well. This is not always what people want. If you want to recognize domainnames, but not a space, you can do one of two things, either use /(?! )$RE{net}{domain}/ or use the C<{-nospace}> option (without an argument). RFC 1035 does B allow host or domain names to start with a digits; however, this restriction is relaxed in RFC 1101; this RFC allows host and domain names to start with a digit, as long as the first part of a domain does not look like an IP address. If the C<< {-rfc1101} >> option is given (as in C<< $RE {net} {domain} {-rfc1101} >>), we will match using the relaxed rules. =head1 REFERENCES =over 4 =item B Mockapetris, P.: I. November 1987. =item B Mockapetris, P.: I. April 1987. =back =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway I. =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2013, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/number.pm000644 000765 000024 00000032633 12654447171 022425 0ustar00abigailstaff000000 000000 package Regexp::Common::number; use Config; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; sub _croak { require Carp; goto &Carp::croak; } my $digits = join ("", 0 .. 9, "A" .. "Z"); sub int_creator { my $flags = $_ [1]; my ($sep, $group, $base, $places, $sign) = @{$flags} {qw /-sep -group -base -places -sign/}; # Deal with the bases. _croak "Base must be between 1 and 36" unless $base >= 1 && $base <= 36; my $chars = substr $digits, 0, $base; $sep = ',' if exists $flags -> {-sep} && !defined $flags -> {-sep}; my $max = $group; $max = $2 if $group =~ /^\s*(\d+)\s*,\s*(\d+)\s*$/; my $quant = $places ? "{$places}" : "+"; return $sep ? qq {(?k:(?k:$sign)(?k:[$chars]{1,$max}} . qq {(?:$sep} . qq {[$chars]{$group})*))} : qq {(?k:(?k:$sign)(?k:[$chars]$quant))} } sub real_creator { my ($base, $places, $radix, $sep, $group, $expon, $sign) = @{$_[1]}{-base, -places, -radix, -sep, -group, -expon, -sign}; _croak "Base must be between 1 and 36" unless $base >= 1 && $base <= 36; $sep = ',' if exists $_[1]->{-sep} && !defined $_[1]->{-sep}; if ($base > 14 && $expon =~ /^[Ee]$/) {$expon = 'G'} foreach ($radix, $sep, $expon) {$_ = "[$_]" if 1 == length} my $chars = substr $digits, 0, $base; return $sep ? qq {(?k:(?i)(?k:$sign)(?k:(?=$radix?[$chars])} . qq {(?k:[$chars]{1,$group}(?:(?:$sep)[$chars]{$group})*)} . qq {(?:(?k:$radix)(?k:[$chars]{$places}))?)} . qq {(?:(?k:$expon)(?k:(?k:$sign)(?k:[$chars]+))|))} : qq {(?k:(?i)(?k:$sign)(?k:(?=$radix?[$chars])} . qq {(?k:[$chars]*)(?:(?k:$radix)(?k:[$chars]{$places}))?)} . qq {(?:(?k:$expon)(?k:(?k:$sign)(?k:[$chars]+))|))}; } sub decimal_creator { my ($base, $places, $radix, $sep, $group, $sign) = @{$_[1]}{-base, -places, -radix, -sep, -group, -sign}; _croak "Base must be between 1 and 36" unless $base >= 1 && $base <= 36; $sep = ',' if exists $_[1]->{-sep} && !defined $_[1]->{-sep}; foreach ($radix, $sep) {$_ = "[$_]" if 1 == length} my $chars = substr $digits, 0, $base; return $sep ? qq {(?k:(?i)(?k:$sign)(?k:(?=$radix?[$chars])} . qq {(?k:[$chars]{1,$group}(?:(?:$sep)[$chars]{$group})*)} . qq {(?:(?k:$radix)(?k:[$chars]{$places}))?))} : qq {(?k:(?i)(?k:$sign)(?k:(?=$radix?[$chars])} . qq {(?k:[$chars]*)(?:(?k:$radix)(?k:[$chars]{$places}))?))} } pattern name => [qw (num int -sep= -base=10 -group=3 -sign=[-+]?)], create => \&int_creator, ; pattern name => [qw (num real -base=10), '-places=0,', qw (-radix=[.] -sep= -group=3 -expon=E -sign=[-+]?)], create => \&real_creator, ; pattern name => [qw (num decimal -base=10), '-places=0,', qw (-radix=[.] -sep= -group=3 -sign=[-+]?)], create => \&decimal_creator, ; sub real_synonym { my ($name, $base) = @_; pattern name => ['num', $name, '-places=0,', '-radix=[.]', '-sep=', '-group=3', '-expon=E', '-sign=[-+]?'], create => sub {my %flags = (%{$_[1]}, -base => $base); real_creator (undef, \%flags); } ; } real_synonym (hex => 16); real_synonym (dec => 10); real_synonym (oct => 8); real_synonym (bin => 2); # 2147483647 pattern name => [qw (num square)], create => sub { use re 'eval'; my $sixty_four_bits = $Config {use64bitint}; # # CPAN testers claim it fails on 5.8.8 and darwin 9.0. # $sixty_four_bits = 0 if $Config {osname} eq 'darwin' && $Config {osvers} eq '9.0' && $] == 5.008008; my $num = $sixty_four_bits ? '0*[1-8]?[0-9]{1,15}' : '0*(?:2(?:[0-0][0-9]{8}' . '|1(?:[0-3][0-9]{7}' . '|4(?:[0-6][0-9]{6}' . '|7(?:[0-3][0-9]{5}' . '|4(?:[0-7][0-9]{4}' . '|8(?:[0-2][0-9]{3}' . '|3(?:[0-5][0-9]{2}' . '|6(?:[0-3][0-9]{1}' . '|4[0-7])))))))))|1?[0-9]{1,9}'; qr {($num)(?(?{sqrt ($^N) == int sqrt ($^N)})|(?!))} }, version => 5.008; ; pattern name => [qw (num roman)], create => '(?xi)(?=[MDCLXVI]) (?k:M{0,3} (D?C{0,3}|CD|CM)? (L?X{0,3}|XL|XC)? (V?I{0,3}|IV|IX)?)' ; 1; __END__ =pod =head1 NAME Regexp::Common::number -- provide regexes for numbers =head1 SYNOPSIS use Regexp::Common qw /number/; while (<>) { /^$RE{num}{int}$/ and print "Integer\n"; /^$RE{num}{real}$/ and print "Real\n"; /^$RE{num}{real}{-base => 16}$/ and print "Hexadecimal real\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{num}{int}{-base}{-sep}{-group}{-places}{-sign}> Returns a pattern that matches an integer. If C<< -base => I >> is specified, the integer is in base I, with C<< 2 <= I <= 36 >>. For bases larger than 10, upper case letters are used. The default base is 10. If C<< -sep => I

>> is specified, the pattern I

is required as a grouping marker within the number. If this option is not given, no grouping marker is used. If C<< -group => I >> is specified, digits between grouping markers must be grouped in sequences of exactly I digits. The default value of I is 3. If C<< -group => I >> is specified, digits between grouping markers must be grouped in sequences of at least I digits, and at most I digits. This option is ignored unless the C<< -sep >> option is used. If C<< -places => I >> is specified, the integer recognized must be exactly I digits wide. If C<< -places => I >> is specified, the integer must be at least I wide, and at most I characters. There is no default, which means that integers are unlimited in size. This option is ignored if the C<< -sep >> option is used. If C<< -sign => I

>> is used, it's a pattern the leading sign has to match. This defaults to C<< [-+]? >>, which means the number is optionally preceded by a minus or a plus. If you want to match unsigned integers, use C<< $RE{num}{int}{-sign => ''} >>. For example: $RE{num}{int} # match 1234567 $RE{num}{int}{-sep=>','} # match 1,234,567 $RE{num}{int}{-sep=>',?'} # match 1234567 or 1,234,567 $RE{num}{int}{-sep=>'.'}{-group=>4} # match 1.2345.6789 Under C<-keep> (see L): =over 4 =item $1 captures the entire number =item $2 captures the optional sign of the number =item $3 captures the complete set of digits =back =head2 C<$RE{num}{real}{-base}{-radix}{-places}{-sep}{-group}{-expon}> Returns a pattern that matches a floating-point number. If C<-base=I> is specified, the number is assumed to be in that base (with A..Z representing the digits for 11..36). By default, the base is 10. If C<-radix=I

> is specified, the pattern I

is used as the radix point for the number (i.e. the "decimal point" in base 10). The default is C. If C<-places=I> is specified, the number is assumed to have exactly I places after the radix point. If C<-places=I> is specified, the number is assumed to have between I and I places after the radix point. By default, the number of places is unrestricted. If C<-sep=I

> specified, the pattern I

is required as a grouping marker within the pre-radix section of the number. By default, no separator is allowed. If C<-group=I> is specified, digits between grouping separators must be grouped in sequences of exactly I characters. The default value of I is 3. If C<-expon=I

> is specified, the pattern I

is used as the exponential marker. The default value of I

is C. If C<-sign=I

> is specified, the pattern I

is used to match the leading sign (and the sign of the exponent). This defaults to C<< [-+]? >>, means means that an optional plus or minus sign can be used. For example: $RE{num}{real} # matches 123.456 or -0.1234567 $RE{num}{real}{-places=>2} # matches 123.45 or -0.12 $RE{num}{real}{-places=>'0,3'} # matches 123.456 or 0 or 9.8 $RE{num}{real}{-sep=>'[,.]?'} # matches 123,456 or 123.456 $RE{num}{real}{-base=>3'} # matches 121.102 Under C<-keep>: =over 4 =item $1 captures the entire match =item $2 captures the optional sign of the number =item $3 captures the complete mantissa =item $4 captures the whole number portion of the mantissa =item $5 captures the radix point =item $6 captures the fractional portion of the mantissa =item $7 captures the optional exponent marker =item $8 captures the entire exponent value =item $9 captures the optional sign of the exponent =item $10 captures the digits of the exponent =back =head2 C<$RE{num}{dec}{-radix}{-places}{-sep}{-group}{-expon}> A synonym for C<< $RE{num}{real}{-base=>10}{...} >> =head2 C<$RE{num}{oct}{-radix}{-places}{-sep}{-group}{-expon}> A synonym for C<< $RE{num}{real}{-base=>8}{...} >> =head2 C<$RE{num}{bin}{-radix}{-places}{-sep}{-group}{-expon}> A synonym for C<< $RE{num}{real}{-base=>2}{...} >> =head2 C<$RE{num}{hex}{-radix}{-places}{-sep}{-group}{-expon}> A synonym for C<< $RE{num}{real}{-base=>16}{...} >> =head2 C<$RE{num}{decimal}{-base}{-radix}{-places}{-sep}{-group}> The same as C<$RE{num}{real}>, except that an exponent isn't allowed. Hence, this returns a pattern matching I numbers. If C<-base=I> is specified, the number is assumed to be in that base (with A..Z representing the digits for 11..36). By default, the base is 10. If C<-radix=I

> is specified, the pattern I

is used as the radix point for the number (i.e. the "decimal point" in base 10). The default is C. If C<-places=I> is specified, the number is assumed to have exactly I places after the radix point. If C<-places=I> is specified, the number is assumed to have between I and I places after the radix point. By default, the number of places is unrestricted. If C<-sep=I

> specified, the pattern I

is required as a grouping marker within the pre-radix section of the number. By default, no separator is allowed. If C<-group=I> is specified, digits between grouping separators must be grouped in sequences of exactly I characters. The default value of I is 3. For example: $RE{num}{decimal} # matches 123.456 or -0.1234567 $RE{num}{decimal}{-places=>2} # matches 123.45 or -0.12 $RE{num}{decimal}{-places=>'0,3'} # matches 123.456 or 0 or 9.8 $RE{num}{decimal}{-sep=>'[,.]?'} # matches 123,456 or 123.456 $RE{num}{decimal}{-base=>3'} # matches 121.102 Under C<-keep>: =over 4 =item $1 captures the entire match =item $2 captures the optional sign of the number =item $3 captures the complete mantissa =item $4 captures the whole number portion of the mantissa =item $5 captures the radix point =item $6 captures the fractional portion of the mantissa =back =head2 C<$RE{num}{square}> Returns a pattern that matches a (decimal) square. Because Perl's arithmetic is lossy when using integers over about 53 bits, this pattern only recognizes numbers less than 9000000000000000, if one uses a Perl that is configured to use 64 bit integers. Otherwise, the limit is 2147483647. These restrictions were introduced in versions 2.116 and 2.117 of Regexp::Common. Regardless whether C<-keep> was set, the matched number will be returned in C<$1>. This pattern is available for version 5.008 and up. =head2 C<$RE{num}{roman}> Returns a pattern that matches an integer written in Roman numbers. Case doesn't matter. Only the more modern style, that is, no more than three repetitions of a letter, is recognized. The largest number matched is I, or 3999. Larger numbers cannot be expressed using ASCII characters. A future version will be able to deal with the Unicode symbols to match larger Roman numbers. Under C<-keep>, the number will be captured in $1. =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2013, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/profanity.pm000644 000765 000024 00000010551 12654447171 023143 0ustar00abigailstaff000000 000000 package Regexp::Common::profanity; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $profanity = '(?:cvff(?:\\ gnxr|\\-gnxr|gnxr|r(?:ef|[feq])|vat|l)?|dhvzf?|fuvg(?:g(?:r(?:ef|[qe])|vat|l)|r(?:ef|[fqel])|vat|[fr])?|g(?:heqf?|jngf?)|jnax(?:r(?:ef|[eq])|vat|f)?|n(?:ef(?:r(?:\\ ubyr|\\-ubyr|ubyr|[fq])|vat|r)|ff(?:\\ ubyrf?|\\-ubyrf?|rq|ubyrf?|vat))|o(?:hyy(?:\\ fuvg(?:g(?:r(?:ef|[qe])|vat)|f)?|\\-fuvg(?:g(?:r(?:ef|[qe])|vat)|f)?|fuvg(?:g(?:r(?:ef|[qe])|vat)|f)?)|ybj(?:\\ wbof?|\\-wbof?|wbof?))|p(?:bpx(?:\\ fhpx(?:ref?|vat)|\\-fhpx(?:ref?|vat)|fhpx(?:ref?|vat))|enc(?:c(?:r(?:ef|[eq])|vat|l)|f)?|h(?:agf?|z(?:vat|zvat|f)))|qvpx(?:\\ urnq|\\-urnq|rq|urnq|vat|yrff|f)|s(?:hpx(?:rq|vat|f)?|neg(?:r[eq]|vat|[fl])?|rygpu(?:r(?:ef|[efq])|vat)?)|un(?:eq[\\-\\ ]?ba|ys(?:\\ n[fe]|\\-n[fe]|n[fe])frq)|z(?:bgure(?:\\ shpx(?:ref?|vat)|\\-shpx(?:ref?|vat)|shpx(?:ref?|vat))|hgu(?:n(?:\\ shpx(?:ref?|vat|[nnn])|\\-shpx(?:ref?|vat|[nnn])|shpx(?:ref?|vat|[nnn]))|re(?:\\ shpx(?:ref?|vat)|\\-shpx(?:ref?|vat)|shpx(?:ref?|vat)))|reqr?))'; my $contextual = '(?:c(?:bex|e(?:bax|vpxf?)|hff(?:vrf|l)|vff(?:\\ gnxr|\\-gnxr|gnxr|r(?:ef|[feq])|vat|l)?)|dhvzf?|ebbg(?:r(?:ef|[eq])|vat|f)?|f(?:bq(?:q(?:rq|vat)|f)?|chax|perj(?:rq|vat|f)?|u(?:nt(?:t(?:r(?:ef|[qe])|vat)|f)?|vg(?:g(?:r(?:ef|[qe])|vat|l)|r(?:ef|[fqel])|vat|[fr])?))|g(?:heqf?|jngf?|vgf?)|jnax(?:r(?:ef|[eq])|vat|f)?|n(?:ef(?:r(?:\\ ubyr|\\-ubyr|ubyr|[fq])|vat|r)|ff(?:\\ ubyrf?|\\-ubyrf?|rq|ubyrf?|vat))|o(?:ba(?:r(?:ef|[fe])|vat|r)|h(?:ttre|yy(?:\\ fuvg(?:g(?:r(?:ef|[qe])|vat)|f)?|\\-fuvg(?:g(?:r(?:ef|[qe])|vat)|f)?|fuvg(?:g(?:r(?:ef|[qe])|vat)|f)?))|n(?:fgneq|yy(?:r(?:ef|[qe])|vat|f)?)|yb(?:bql|j(?:\\ wbof?|\\-wbof?|wbof?)))|p(?:bpx(?:\\ fhpx(?:ref?|vat)|\\-fhpx(?:ref?|vat)|fhpx(?:ref?|vat)|f)?|enc(?:c(?:r(?:ef|[eq])|vat|l)|f)?|h(?:agf?|z(?:vat|zvat|f)))|q(?:batf?|vpx(?:\\ urnq|\\-urnq|rq|urnq|vat|yrff|f)?)|s(?:hpx(?:rq|vat|f)?|neg(?:r[eq]|vat|[fl])?|rygpu(?:r(?:ef|[efq])|vat)?)|u(?:hzc(?:r(?:ef|[eq])|vat|f)?|n(?:eq[\\-\\ ]?ba|ys(?:\\ n[fe]|\\-n[fe]|n[fe])frq))|z(?:bgure(?:\\ shpx(?:ref?|vat)|\\-shpx(?:ref?|vat)|shpx(?:ref?|vat))|hgu(?:n(?:\\ shpx(?:ref?|vat|[nnn])|\\-shpx(?:ref?|vat|[nnn])|shpx(?:ref?|vat|[nnn]))|re(?:\\ shpx(?:ref?|vat)|\\-shpx(?:ref?|vat)|shpx(?:ref?|vat)))|reqr?))'; tr/A-Za-z/N-ZA-Mn-za-m/ foreach $profanity, $contextual; pattern name => [qw (profanity)], create => '(?:\b(?k:' . $profanity . ')\b)', ; pattern name => [qw (profanity contextual)], create => '(?:\b(?k:' . $contextual . ')\b)', ; 1; __END__ =pod =head1 NAME Regexp::Common::profanity -- provide regexes for profanity =head1 SYNOPSIS use Regexp::Common qw /profanity/; while (<>) { /$RE{profanity}/ and print "Contains profanity\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 $RE{profanity} Returns a pattern matching words -- such as Carlin's "big seven" -- that are most likely to give offense. Note that correct anatomical terms are deliberately I included in the list. Under C<-keep> (see L): =over 4 =item $1 captures the entire word =back =head2 C<$RE{profanity}{contextual}> Returns a pattern matching words that are likely to give offense when used in specific contexts, but which also have genuinely non-offensive meanings. Under C<-keep> (see L): =over 4 =item $1 captures the entire word =back =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/SEN.pm000644 000765 000024 00000006731 12654447171 021562 0ustar00abigailstaff000000 000000 package Regexp::Common::SEN; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; =begin does_not_exist sub par11 { my $string = shift; my $sum = 0; for my $i (0 .. length ($string) - 1) { my $c = substr ($string, $i, 1); $sum += $c * (length ($string) - $i) } !($sum % 11) } =end does_not_exist =cut # http://www.ssa.gov/history/ssn/geocard.html pattern name => [qw /SEN USA SSN -sep=-/], create => sub { my $sep = $_ [1] {-sep}; "(?k:(?k:[1-9][0-9][0-9]|0[1-9][0-9]|00[1-9])$sep" . "(?k:[1-9][0-9]|0[1-9])$sep" . "(?k:[1-9][0-9][0-9][0-9]|0[1-9][0-9][0-9]|" . "00[1-9][0-9]|000[1-9]))" }, ; =begin does_not_exist It's not clear whether this is the right checksum. # http://www.google.nl/search?q=cache:8m1zKNYrEO0J:www.enschede.nl/nieuw/projecten/aanbesteding/integratie/pve%2520Bijlage%25207.5.doc+Sofi+nummer+formaat&hl=en&start=56&lr=lang_en|lang_nl&ie=UTF-8 pattern name => [qw /SEN Netherlands SoFi/], create => sub { # 9 digits (d1 d2 d3 d4 d5 d6 d7 d8 d9) # 9*d1 + 8*d2 + 7*d3 + 6*d4 + 5*d5 + 4*d6 + 3*d7 + 2*d8 + 1*d9 # == 0 mod 11. qr /([0-9]{9})(?(?{par11 ($^N)})|(?!))/; } ; =end does_not_exist =cut 1; __END__ =pod =head1 NAME Regexp::Common::SEN -- provide regexes for Social-Economical Numbers. =head1 SYNOPSIS use Regexp::Common qw /SEN/; while (<>) { /^$RE{SEN}{USA}{SSN}$/ and print "Social Security Number\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{SEN}{USA}{SSN}{-sep}> Returns a pattern that matches an American Social Security Number (SSN). SSNs consist of three groups of numbers, separated by a hypen (C<->). This pattern only checks for a valid structure, that is, it validates whether a number is valid SSN, was a valid SSN, or maybe a valid SSN in the future. There are almost a billion possible SSNs, and about 400 million are in use, or have been in use. If C<-sep=I

> is specified, the pattern I

is used as the separator between the groups of numbers. Under C<-keep> (see L): =over 4 =item $1 captures the entire SSN. =item $2 captures the first group of digits (the area number). =item $3 captures the second group of digits (the group number). =item $4 captures the third group of digits (the serial number). =back =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHORS Damian Conway and Abigail. =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/000755 000765 000024 00000000000 12654450573 021227 5ustar00abigailstaff000000 000000 Regexp-Common-2016020301/lib/Regexp/Common/URI.pm000644 000765 000024 00000006540 12654447171 021572 0ustar00abigailstaff000000 000000 package Regexp::Common::URI; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use Exporter (); use vars qw /@EXPORT_OK @ISA/; @ISA = qw /Exporter/; @EXPORT_OK = qw /register_uri/; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; # Use 'require' here, not 'use', so we delay running them after we are compiled. # We also do it using an 'eval'; this saves us from have repeated similar # lines. The eval is further explained in 'perldoc -f require'. my @uris = qw /fax file ftp gopher http pop prospero news tel telnet tv wais/; foreach my $uri (@uris) { eval "require Regexp::Common::URI::$uri"; die $@ if $@; } my %uris; sub register_uri { my ($scheme, $uri) = @_; $uris {$scheme} = $uri; } pattern name => [qw (URI)], create => sub {my $uri = join '|' => values %uris; $uri =~ s/\(\?k:/(?:/g; "(?k:$uri)"; }, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI -- provide patterns for URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{HTTP}/ and print "Contains an HTTP URI.\n"; } =head1 DESCRIPTION Patterns for the following URIs are supported: fax, file, FTP, gopher, HTTP, news, NTTP, pop, prospero, tel, telnet, tv and WAIS. Each is documented in the I>, manual page, for the appropriate scheme (in lowercase), except for I URIs which are found in I. =head2 C<$RE{URI}> Return a pattern that recognizes any of the supported URIs. With C<{-keep}>, only the entire URI is returned (in C<$1>). =head1 REFERENCES =over 4 =item B<[DRAFT-URI-TV]> Zigmond, D. and Vickers, M: I. December 2000. =item B<[DRAFT-URL-FTP]> Casey, James: I. November 1996. =item B<[RFC 1035]> Mockapetris, P.: I. November 1987. =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =item B<[RFC 2616]> Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and Berners-Lee, Tim: I. June 1999. =item B<[RFC 2806]> Vaha-Sipila, A.: I. April 2000. =back =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/whitespace.pm000644 000765 000024 00000003633 12654447171 023267 0ustar00abigailstaff000000 000000 package Regexp::Common::whitespace; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; pattern name => [qw (ws crop)], create => '(?:^\s+|\s+$)', subs => sub {$_[1] =~ s/^\s+//; $_[1] =~ s/\s+$//;} ; 1; __END__ =pod =head1 NAME Regexp::Common::whitespace -- provides a regex for leading or trailing whitescape =head1 SYNOPSIS use Regexp::Common qw /whitespace/; while (<>) { s/$RE{ws}{crop}//g; # Delete surrounding whitespace } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. =head2 C<$RE{ws}{crop}> Returns a pattern that identifies leading or trailing whitespace. For example: $str =~ s/$RE{ws}{crop}//g; # Delete surrounding whitespace The call: $RE{ws}{crop}->subs($str); is optimized (but probably still slower than doing the s///g explicitly). This pattern does not capture under C<-keep>. =head1 SEE ALSO L for a general description of how to use this interface. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. For a start, there are many common regexes missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/zip.pm000644 000765 000024 00000052354 12654447171 021741 0ustar00abigailstaff000000 000000 package Regexp::Common::zip; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; # # Prefer '[0-9]' over \d, because the latter may include more # in Unicode string. # my %code = ( Australia => [qw /AUS? AU AUS/], Belgium => [qw /BE? BE B/], Denmark => [qw /DK DK DK/], France => [qw /FR? FR F/], Germany => [qw /DE? DE D/], Greenland => [qw /DK DK DK/], Italy => [qw /IT? IT I/], Netherlands => [qw /NL NL NL/], Norway => [qw /NO? NO N/], Spain => [qw /ES? ES E/], USA => [qw /USA? US USA/], ); # Returns the empty string if the argument is undefined, the argument otherwise. sub __ {defined $_ [0] ? $_ [0] : ""} # Used for allowable options. If the value starts with 'y', the option is # required ("{1,1}" is returned, if the value starts with 'n', the option # is disallowed ("{0,0}" is returned), otherwise, the option is allowed, # but not required ("{0,1}" is returned). sub _t { if (defined $_ [0]) { if ($_ [0] =~ /^y/i) {return "{1,1}"} if ($_ [0] =~ /^n/i) {return "{0,0}"} } "{0,1}" } # Returns the (sub)pattern for the country named '$name', and the # -country option '$country'. sub _c { my ($name, $country) = @_; if (defined $country && $country ne "") { if ($country eq 'iso') {return $code {$name} [1]} if ($country eq 'cept') {return $code {$name} [2]} return $country; } $code {$name} [0] } my %zip = ( Australia => "(?k:(?k:[1-8][0-9]|9[0-7]|0?[28]|0?9(?=09))(?k:[0-9]{2}))", # Postal codes of the form 'DDDD', with the first # two digits 02, 08 or 20-97. Leading 0 may be omitted. # 909 and 0909 are valid as well - but no other postal # codes starting with 9 or 09. Belgium => "(?k:(?k:[1-9])(?k:[0-9]{3}))", # Postal codes of the form: 'DDDD', with the first # digit representing the province; the others # distribution sectors. Postal codes do not start # with a zero. Denmark => "(?k:(?k:[1-9])(?k:[0-9])(?k:[0-9]{2}))", # Postal codes of the form: 'DDDD', with the first # digit representing the distribution region, the # second digit the distribution district. Postal # codes do not start with a zero. Postal codes # starting with '39' are in Greenland. France => "(?k:(?k:[0-8][0-9]|9[0-8])(?k:[0-9]{3}))", # Postal codes of the form: 'DDDDD'. All digits are used. # First two digits indicate the department, and range # from 01 to 98, or 00 for army. Germany => "(?k:(?k:[0-9])(?k:[0-9])(?k:[0-9]{3}))", # Postal codes of the form: 'DDDDD'. All digits are used. # First digit is the distribution zone, second a # distribution region. Other digits indicate the # distribution district and postal town. Greenland => "(?k:(?k:39)(?k:[0-9]{2}))", # Postal codes of Greenland are part of the Danish # system. Codes in Greenland start with 39. Italy => "(?k:(?k:[0-9])(?k:[0-9])(?k:[0-9])(?k:[0-9])(?k:[0-9]))", # First digit: region. # Second digit: province. # Third digit: capital/province (odd for capital). # Fourth digit: route. # Fifth digit: place on route (0 for small places) Norway => "(?k:[0-9]{4})", # Four digits, no significance (??). Spain => "(?k:(?k:0[1-9]|[1-4][0-9]|5[0-2])(?k:[0-9])(?k:[0-9]{2}))", # Five digits, first two indicate the province. # Third digit: large town, main delivery rounds. # Last 2 digits: delivery area, secondary delivery route # or link to rural areas. Switzerland => "(?k:[1-9][0-9]{3})", # Four digits, first is district, second is area, # third is route, fourth is post office number. ); my %alternatives = ( Australia => [qw /Australian/], France => [qw /French/], Germany => [qw /German/], ); while (my ($country, $zip) = each %zip) { my @names = ($country); push @names => @{$alternatives {$country}} if $alternatives {$country}; foreach my $name (@names) { my $pat_name = $name eq "Denmark" && $] < 5.00503 ? [zip => $name, qw /-country=/] : [zip => $name, qw /-prefix= -country=/]; pattern name => $pat_name, create => sub { my $pt = _t $_ [1] {-prefix}; my $cn = _c $country => $_ [1] {-country}; my $pfx = "(?:(?k:$cn)-)"; "(?k:$pfx$pt$zip)"; }, ; } } # Postal codes of the form 'DDDD LL', with F, I, O, Q, U and Y not # used, SA, SD and SS unused combinations, and the first digit # cannot be 0. No specific meaning to the letters or digits. foreach my $country (qw /Netherlands Dutch/) { pattern name => ['zip', $country => qw /-prefix= -country=/, "-sep= "], create => sub { my $pt = _t $_ [1] {-prefix}; # Unused letters: F, I, O, Q, U, Y. # Unused combinations: SA, SD, SS. my $num = '[1-9][0-9]{3}'; my $let = '[A-EGHJ-NPRTVWXZ][A-EGHJ-NPRSTVWXZ]|' . 'S[BCEGHJ-NPRTVWXZ]'; my $sep = __ $_ [1] {-sep}; my $cn = _c Netherlands => $_ [1] {-country}; my $pfx = "(?:(?k:$cn)-)"; "(?k:$pfx$pt(?k:(?k:$num)(?k:$sep)(?k:$let)))"; }, ; } # Postal codes of the form 'DDDDD' or 'DDDDD-DDDD'. All digits are used, # none carry any specific meaning. pattern name => [qw /zip US -prefix= -country= -extended= -sep=-/], create => sub { my $pt = _t $_ [1] {-prefix}; my $et = _t $_ [1] {-extended}; my $sep = __ $_ [1] {-sep}; my $cn = _c USA => $_ [1] {-country}; my $pfx = "(?:(?k:$cn)-)"; # my $zip = "(?k:[0-9]{5})"; # my $ext = "(?:(?k:$sep)(?k:[0-9]{4}))"; my $zip = "(?k:(?k:[0-9]{3})(?k:[0-9]{2}))"; my $ext = "(?:(?k:$sep)(?k:(?k:[0-9]{2})(?k:[0-9]{2})))"; "(?k:$pfx$pt(?k:$zip$ext$et))"; }, version => 5.00503, ; # pattern name => [qw /zip British/, "-sep= "], # create => sub { # my $sep = $_ [1] -> {-sep}; # # my $london = '(?:EC[1-4]|WC[12]|S?W1)[A-Z]'; # my $single = '[BGLMS][0-9]{1,2}'; # my $double = '[A-Z]{2}[0-9]{1,2}'; # # my $left = "(?:$london|$single|$double)"; # my $right = '[0-9][ABD-HJLNP-UW-Z]{2}'; # # "(?k:(?k:$left)(?k:$sep)(?k:$right))"; # }, # ; # # pattern name => [qw /zip Canadian/, "-sep= "], # create => sub { # my $sep = $_ [1] -> {-sep}; # # my $left = '[A-Z][0-9][A-Z]'; # my $right = '[0-9][A-Z][0-9]'; # # "(?k:(?k:$left)(?k:$sep)(?k:$right))"; # }, # ; 1; __END__ =pod =head1 NAME Regexp::Common::zip -- provide regexes for postal codes. =head1 SYNOPSIS use Regexp::Common qw /zip/; while (<>) { /^$RE{zip}{Netherlands}$/ and print "Dutch postal code\n"; } =head1 DESCRIPTION Please consult the manual of L for a general description of the works of this interface. Do not use this module directly, but load it via I. This module offers patterns for zip or postal codes of many different countries. They all have the form C<$RE{zip}{Country}[{options}]>. The following common options are used: =head2 C<{-prefix=[yes|no|allow]}> and C<{-country=PAT}>. Postal codes can be prefixed with a country abbreviation. That is, a dutch postal code of B<1234 AB> can also be written as B. By default, all the patterns will allow the prefixes. But this can be changed with the C<-prefix> option. With C<-prefix=yes>, the returned pattern requires a country prefix, while C<-prefix=no> disallows a prefix. Any argument that doesn't start with a C or a C allows a country prefix, but doesn't require them. The prefixes used are, unfortunally, not always the same. Officially, ISO country codes need to be used, but the usage of I codes (the same ones as used on cars) is common too. By default, each postal code will recognize a country prefix that's either the ISO standard or the CEPT code. That is, German postal codes may prefixed with either C or C. The recognized prefix can be changed with the C<-country> option, which takes a (sub)pattern as argument. The arguments C and C are special, and indicate the language prefix should be the ISO country code, or the CEPT code. Examples: /$RE{zip}{Netherlands}/; # Matches '1234 AB' and 'NL-1234 AB'. /$RE{zip}{Netherlands}{-prefix => 'no'}/; # Matches '1234 AB' but not 'NL-1234 AB'. /$RE{zip}{Netherlands}{-prefix => 'yes'}/; # Matches 'NL-1234 AB' but not '1234 AB'. /$RE{zip}{Germany}/; # Matches 'DE-12345' and 'D-12345'. /$RE{zip}{Germany}{-country => 'iso'}/; # Matches 'DE-12345' but not 'D-12345'. /$RE{zip}{Germany}{-country => 'cept'}/; # Matches 'D-12345' but not 'DE-12345'. /$RE{zip}{Germany}{-country => 'GER'}/; # Matches 'GER-12345'. =head2 C<{-sep=PAT}> Some countries have postal codes that consist of two parts. Typically there is an official way of separating those parts; but in practise people tend to use different separators. For instance, if the official way to separate parts is to use a space, it happens that the space is left off. The C<-sep> option can be given a pattern as argument which indicates what to use as a separator between the parts. Examples: /$RE{zip}{Netherlands}/; # Matches '1234 AB' but not '1234AB'. /$RE{zip}{Netherlands}{-sep => '\s*'}/; # Matches '1234 AB' and '1234AB'. =head2 C<$RE{zip}{Australia}> Returns a pattern that recognizes Australian postal codes. Australian postal codes consist of four digits; the first two digits, which range from '10' to '97', indicate the state. Territories use '02' or '08' as starting digits; the leading zero is optional. '0909' is the only postal code starting with '09' (the leading zero is optional here as well) - this is the postal code for the Nothern Territory University). The (optional) country prefixes are I (ISO country code) and I (CEPT code). Regexp::Common 2.107 and before used C<$RE{zip}{Australia}>. This is still supported. If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The state or territory. =item $5 The last two digits. =back =head2 C<$RE{zip}{Belgium}> Returns a pattern than recognizes Belgian postal codes. Belgian postal codes consist of 4 digits, of which the first indicates the province. The (optional) country prefixes are I (ISO country code) and I (CEPT code). If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The digit indicating the province. =item $5 The last three digits of the postal code. =back =head2 C<$RE{zip}{Denmark}> Returns a pattern that recognizes Danish postal codes. Danish postal codes consist of four numbers; the first digit (which cannot be 0), indicates the distribution region, the second the distribution district. The (optional) country prefix is I, which is both the ISO country code and the CEPT code. If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The digit indicating the distribution region. =item $5 The digit indicating the distribution district. =item $6 The last two digits of the postal code. =back =head2 C<$RE{zip}{France}> Returns a pattern that recognizes French postal codes. French postal codes consist of five numbers; the first two numbers, which range from '01' to '98', indicate the department. The (optional) country prefixes are I (ISO country code) and I (CEPT code). Regexp::Common 2.107 and before used C<$RE{zip}{French}>. This is still supported. If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The department. =item $5 The last three digits. =back =head2 C<$RE{zip}{Germany}> Returns a pattern that recognizes German postal codes. German postal codes consist of five numbers; the first number indicating the distribution zone, the second the distribution region, while the latter three indicate the distribution district and the postal town. The (optional) country prefixes are I (ISO country code) and I (CEPT code). Regexp::Common 2.107 and before used C<$RE{zip}{German}>. This is still supported. If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The distribution zone. =item $5 The distribution region. =item $6 The distribution district and postal town. =back =head2 C<$RE{zip}{Greenland}> Returns a pattern that recognizes postal codes from Greenland. Greenland, being part of Denmark, uses Danish postal codes. All postal codes of Greenland start with 39. The (optional) country prefix is I, which is both the ISO country code and the CEPT code. If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 39, being the distribution region and distribution district for Greenland. =item $5 The last two digits of the postal code. =back =head2 C<$RE{zip}{Italy}> Returns a pattern recognizing Italian postal codes. Italian postal codes consist of 5 digits. The first digit indicates the region, the second the province. The third digit is odd for province capitals, and even for the province itself. The fourth digit indicates the route, and the fifth a place on the route (0 for small places, alphabetically for the rest). The country prefix is either I (the ISO country code), or I (the CEPT code). If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The region. =item $5 The province. =item $6 Capital or province. =item $7 The route. =item $8 The place on the route. =back =head2 C<$RE{zip}{Netherlands}> Returns a pattern that recognizes Dutch postal codes. Dutch postal codes consist of 4 digits and 2 letters, separated by a space. The separator can be changed using the C<{-sep}> option, as discussed above. The (optional) country prefix is I, which is both the ISO country code and the CEPT code. Regexp::Common 2.107 and earlier used C<$RE{zip}{Dutch}>. This is still supported. If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The digits part of the postal code. =item $5 The separator between the digits and the letters. =item $6 The letters part of the postal code. =back =head2 C<< $RE{zip}{Norway} >> Returns a pattern that recognizes Norwegian postal codes. Norwegian postal codes consist of four digits. The country prefix is either I (the ISO country code), or I (the CEPT code). If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =back =head2 C<< $RE{zip}{Spain} >> Returns a pattern that recognizes Spanish postal codes. Spanish postal codes consist of 5 digits. The first 2 indicate one of Spains fifties provinces (in alphabetical order), starting with C<00>. The third digit indicates a main city or the main delivery rounds. The last two digits are the delivery area, secondary delivery route or a link to rural areas. The country prefix is either I (the ISO country code), or I (the CEPT code). If C<{-keep}> is used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The two digits indicating the province. =item $5 The digit indicating the main city or main delivery route. =item $6 The digits indicating the delivery area, secondary delivery route or a link to rural areas. =back =head2 C<< $RE{zip}{Switzerland} >> Returns a pattern that recognizes Swiss postal codes. Swiss postal codes consist of 4 digits. The first indicates the district, starting with 1. The second indicates the area, the third, the route, and the fourth the post office number. =head2 C<< $RE{zip}{US}{-extended => [yes|no|allow]} >> Returns a pattern that recognizes US zip codes. US zip codes consist of 5 digits, with an optional 4 digit extension. By default, extensions are allowed, but not required. This can be influenced by the C<-extended> option. If its argument starts with a C, extensions are required; if the argument starts with a C, extensions will not be recognized. If an extension is used, a dash is used to separate the main part from the extension, but this can be changed with the C<-sep> option. The country prefix is either I (the ISO country code), or I (the CEPT code). If C<{-keep}> is being used, the following variables will be set: =over 4 =item $1 The entire postal code. =item $2 The country code prefix. =item $3 The postal code without the country prefix. =item $4 The first 5 digits of the postal code. =item $5 The first three digits of the postal code, indicating a sectional center or a large city. New in Regexp::Common 2.119. =item $6 The last 2 digits of the 5 digit part of the postal code, indicating a post office facility or delivery area. New in Regexp::Common 2.119. =item $7 The separator between the 5 digit part and the 4 digit part. Up to Regexp::Common 2.118, this used to be $5. =item $8 The 4 digit part of the postal code (if any). Up to Regexp::Common 2.118, this used to be $6. =item $9 The first two digits of the 4 digit part of the postal code, indicating a sector, or several blocks. New in Regexp::Common 2.119. =item $10 The last two digits of the 4 digit part of the postal code, indicating a segment or one side of a street. New in Regexp::Common 2.119. =back You need at least version 5.005_03 to be able to use US postal codes. Older versions contain a bug that let the pattern match invalid US postal codes. =head3 Questions =over 4 =item Can the 5 digit part of the zip code (in theory) start with 000? =item Can the 5 digit part of the zip code (in theory) end with 00? =item Can the 4 digit part of the zip code (in theory) start with 00? =item Can the 4 digit part of the zip code (in theory) end with 00? =back =head1 SEE ALSO L for a general description of how to use this interface. =over 4 =item L Frank's compulsive guide to postal addresses. =item L Postal addressing systems. =item L Postal code information. =item L Links to Postcode Pages. =item L Information about Australian postal codes. =item L Information about US postal codes. =item L =back =head1 AUTHORS Damian Conway S<(I)> and Abigail S<(I)>. =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Zip codes for most countries are missing. Send them in to I. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/fax.pm000644 000765 000024 00000005324 12654447171 022347 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::fax; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC2806 qw /$fax_subscriber $fax_subscriber_no_future/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $fax_scheme = 'fax'; my $fax_uri = "(?k:(?k:$fax_scheme):(?k:$fax_subscriber))"; my $fax_uri_nf = "(?k:(?k:$fax_scheme):(?k:$fax_subscriber_no_future))"; register_uri $fax_scheme => $fax_uri; pattern name => [qw (URI fax)], create => $fax_uri ; pattern name => [qw (URI fax nofuture)], create => $fax_uri_nf ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::fax -- Returns a pattern for fax URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{fax}/ and print "Contains a fax URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{fax} Returns a pattern that matches I URIs, as defined by RFC 2806. Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The phone number, including any possible add-ons like ISDN subaddress, a post dial part, area specifier, service provider, etc. =back =head2 C<$RE{URI}{fax}{nofuture}> As above (including what's returned by C<{-keep}>), with the exception that I are not allowed. Without allowing those I, it becomes much easier to check a URI if the correct syntax for post dial, service provider, phone context, etc has been used - otherwise the regex could always classify them as a I. =head1 REFERENCES =over 4 =item B<[RFC 1035]> Mockapetris, P.: I. November 1987. =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =item B<[RFC 2806]> Vaha-Sipila, A.: I. April 2000. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/file.pm000644 000765 000024 00000004057 12654447171 022512 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::file; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$host $fpath/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $scheme = 'file'; my $uri = "(?k:(?k:$scheme)://(?k:(?k:(?:$host|localhost)?)" . "(?k:/(?k:$fpath))))"; register_uri $scheme => $uri; pattern name => [qw (URI file)], create => $uri, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::file -- Returns a pattern for file URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{file}/ and print "Contains a file URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{file} Returns a pattern that matches I URIs, as defined by RFC 1738. File URIs have the form: "file:" "//" [ host | "localhost" ] "/" fpath Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The part of the URI following "file://". =item $4 The hostname. =item $5 The path name, including the leading slash. =item $6 The path name, without the leading slash. =back =head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/ftp.pm000644 000765 000024 00000012604 12654447171 022361 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::ftp; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC2396 qw /$host $port $ftp_segments $userinfo $userinfo_no_colon/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $ftp_uri = "(?k:(?k:ftp)://(?:(?k:$userinfo)(?k:)\@)?(?k:$host)" . "(?::(?k:$port))?(?k:/(?k:(?k:$ftp_segments)" . "(?:;type=(?k:[AIai]))?))?)"; my $ftp_uri_password = "(?k:(?k:ftp)://(?:(?k:$userinfo_no_colon)" . "(?::(?k:$userinfo_no_colon))?\@)?(?k:$host)" . "(?::(?k:$port))?(?k:/(?k:(?k:$ftp_segments)" . "(?:;type=(?k:[AIai]))?))?)"; register_uri FTP => $ftp_uri; pattern name => [qw (URI FTP), "-type=[AIai]", "-password="], create => sub { my $uri = exists $_ [1] -> {-password} && !defined $_ [1] -> {-password} ? $ftp_uri_password : $ftp_uri; my $type = $_ [1] -> {-type}; $uri =~ s/\[AIai\]/$type/; $uri; } ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::ftp -- Returns a pattern for FTP URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{FTP}/ and print "Contains an FTP URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{FTP}{-type}{-password}; Returns a regex for FTP URIs. Note: FTP URIs are not formally defined. RFC 1738 defines FTP URLs, but parts of that RFC have been obsoleted by RFC 2396. However, the differences between RFC 1738 and RFC 2396 are such that they aren't applicable straightforwardly to FTP URIs. There are two main problems: =over 4 =item Passwords. RFC 1738 allowed an optional username and an optional password (separated by a colon) in the FTP URL. Hence, colons were not allowed in either the username or the password. RFC 2396 strongly recommends passwords should not be used in URIs. It does allow for I instead. This userinfo part may contain colons, and hence contain more than one colon. The regexp returned follows the RFC 2396 specification, unless the I<{-password}> option is given; then the regex allows for an optional username and password, separated by a colon. =item The ;type specifier. RFC 1738 does not allow semi-colons in FTP path names, because a semi-colon is a reserved character for FTP URIs. The semi-colon is used to separate the path from the option I specifier. However, in RFC 2396, paths consist of slash separated segments, and each segment is a semi-colon separated group of parameters. Straigthforward application of RFC 2396 would mean that a trailing I specifier couldn't be distinguished from the last segment of the path having a two parameters, the last one starting with I. Therefore we have opted to disallow a semi-colon in the path part of an FTP URI. Furthermore, RFC 1738 allows three values for the type specifier, I, I and I (either upper case or lower case). However, the internet draft about FTP URIs B<[DRAFT-FTP-URL]> (which expired in May 1997) notes the lack of consistent implementation of the I parameter and drops I from the set of possible values. We follow this practise; however, RFC 1738 behaviour can be archieved by using the I<-type => "[ADIadi]"> parameter. =back FTP URIs have the following syntax: "ftp:" "//" [ userinfo "@" ] host [ ":" port ] [ "/" path [ ";type=" value ]] When using I<{-password}>, we have the syntax: "ftp:" "//" [ user [ ":" password ] "@" ] host [ ":" port ] [ "/" path [ ";type=" value ]] Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The userinfo, or if I<{-password}> is used, the username. =item $4 If I<{-password}> is used, the password, else C. =item $5 The hostname or IP address. =item $6 The port number. =item $7 The full path and type specification, including the leading slash. =item $8 The full path and type specification, without the leading slash. =item $9 The full path, without the type specification nor the leading slash. =item $10 The value of the type specification. =back =head1 REFERENCES =over 4 =item B<[DRAFT-URL-FTP]> Casey, James: I. November 1996. =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/gopher.pm000644 000765 000024 00000010701 12654447171 023050 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::gopher; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$host $port $uchars/; use Regexp::Common::URI::RFC1808 qw /$pchars $pchar_range/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $pchars_notab = "(?:(?:[$pchar_range]+|" . "%(?:[1-9a-fA-F][0-9a-fA-F]|0[0-8a-fA-F]))*)"; my $gopherplus_string = $pchars; my $search = $pchars; my $search_notab = $pchars_notab; my $selector = $pchars; my $selector_notab = $pchars_notab; my $gopher_type = "(?:[0-9+IgT])"; my $scheme = "gopher"; my $uri = "(?k:(?k:$scheme)://(?k:$host)(?::(?k:$port))?" . "/(?k:(?k:$gopher_type)(?k:$selector)))"; my $uri_notab = "(?k:(?k:$scheme)://(?k:$host)(?::(?k:$port))?" . "/(?k:(?k:$gopher_type)(?k:$selector_notab)" . "(?:%09(?k:$search_notab)(?:%09(?k:$gopherplus_string))?)?))"; register_uri $scheme => $uri; pattern name => [qw (URI gopher -notab=)], create => sub { exists $_ [1] {-notab} && !defined $_ [1] {-notab} ? $uri_notab : $uri}, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::gopher -- Returns a pattern for gopher URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{gopher}/ and print "Contains a gopher URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{gopher}{-notab} Gopher URIs are poorly defined. Originally, RFC 1738 defined gopher URIs, but they were later redefined in an internet draft. One that was expired in June 1997. The internet draft for gopher URIs defines them as follows: "gopher:" "//" host [ ":" port ] "/" gopher-type selector [ "%09" search [ "%09" gopherplus_string ]] Unfortunally, a I is defined in such a way that characters may be escaped using the URI escape mechanism. This includes tabs, which escaped are C<%09>. Hence, the syntax cannot distinguish between a URI that has both a I and a I part, and an URI where the I includes an escaped tab. (The text of the draft forbids tabs to be present in the I though). C<$RE{URI}{gopher}> follows the defined syntax. To disallow escaped tabs in the I and I parts, use C<$RE{URI}{gopher}{-notab}>. There are other differences between the text and the given syntax. According to the text, selector strings cannot have tabs, linefeeds or carriage returns in them. The text also allows the entire I, (the part after the slash following the hostport) to be empty; if this is empty the slash may be omitted as well. However, this isn't reflected in the syntax. Under C<{-keep}>, the following are returned: =over 4 =item $1 The entire URI. =item $2 The scheme. =item $3 The host (name or address). =item $4 The port (if any). =item $5 The "gopher-path", the part after the / following the host and port. =item $6 The gopher-type. =item $7 The selector. (When no C<{-notab}> is used, this includes the search and gopherplus_string, including the separating escaped tabs). =item $8 The search, if given. (Only when C<{-notab}> is given). =item $9 The gopherplus_string, if given. (Only when C<{-notab}> is given). =back head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =item B<[RFC 1808]> Fielding, R.: I. June 1995. =item B<[GOPHER URL]> Krishnan, Murali R., Casey, James: "A Gopher URL Format". Expired Internet draft I. December 1996. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/http.pm000644 000765 000024 00000005553 12654447171 022554 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::http; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC2396 qw /$host $port $path_segments $query/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $http_uri = "(?k:(?k:http)://(?k:$host)(?::(?k:$port))?" . "(?k:/(?k:(?k:$path_segments)(?:[?](?k:$query))?))?)"; my $https_uri = $http_uri; $https_uri =~ s/http/https?/; register_uri HTTP => $https_uri; pattern name => [qw (URI HTTP), "-scheme=http"], create => sub { my $scheme = $_ [1] -> {-scheme}; my $uri = $http_uri; $uri =~ s/http/$scheme/; $uri; } ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::http -- Returns a pattern for HTTP URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{HTTP}/ and print "Contains an HTTP URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{HTTP}{-scheme} Provides a regex for an HTTP URI as defined by RFC 2396 (generic syntax) and RFC 2616 (HTTP). If C<< -scheme => I

>> is specified the pattern I

is used as the scheme. By default I

is C. C and C are reasonable alternatives. The syntax for an HTTP URI is: "http:" "//" host [ ":" port ] [ "/" path [ "?" query ]] Under C<{-keep}>, the following are returned: =over 4 =item $1 The entire URI. =item $2 The scheme. =item $3 The host (name or address). =item $4 The port (if any). =item $5 The absolute path, including the query and leading slash. =item $6 The absolute path, including the query, without the leading slash. =item $7 The absolute path, without the query or leading slash. =item $8 The query, without the question mark. =back =head1 REFERENCES =over 4 =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =item B<[RFC 2616]> Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and Berners-Lee, Tim: I. June 1999. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/news.pm000644 000765 000024 00000005412 12654447171 022543 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::news; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$grouppart $group $article $host $port $digits/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $news_scheme = 'news'; my $news_uri = "(?k:(?k:$news_scheme):(?k:$grouppart))"; my $nntp_scheme = 'nntp'; my $nntp_uri = "(?k:(?k:$nntp_scheme)://(?k:(?k:(?k:$host)(?::(?k:$port))?)" . "/(?k:$group)(?:/(?k:$digits))?))"; register_uri $news_scheme => $news_uri; register_uri $nntp_scheme => $nntp_uri; pattern name => [qw (URI news)], create => $news_uri, ; pattern name => [qw (URI NNTP)], create => $nntp_uri, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::news -- Returns a pattern for file URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{news}/ and print "Contains a news URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{news} Returns a pattern that matches I URIs, as defined by RFC 1738. News URIs have the form: "news:" ( "*" | group | article "@" host ) Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The part of the URI following "news://". =back =head2 $RE{URI}{NNTP} Returns a pattern that matches I URIs, as defined by RFC 1738. NNTP URIs have the form: "nntp://" host [ ":" port ] "/" group [ "/" digits ] Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The part of the URI following "nntp://". =item $4 The host and port, separated by a colon. If no port was given, just the host. =item $5 The host. =item $6 The port, if given. =item $7 The group. =item $8 The digits, if given. =back =head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/pop.pm000644 000765 000024 00000004054 12654447171 022366 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::pop; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$host $port/; use Regexp::Common::URI::RFC2384 qw /$enc_user $enc_auth_type/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $scheme = "pop"; my $uri = "(?k:(?k:$scheme)://(?:(?k:$enc_user)" . "(?:;AUTH=(?k:[*]|$enc_auth_type))?\@)?" . "(?k:$host)(?::(?k:$port))?)"; register_uri $scheme => $uri; pattern name => [qw (URI POP)], create => $uri, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::pop -- Returns a pattern for POP URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{POP}/ and print "Contains a POP URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{POP} Returns a pattern that matches I URIs, as defined by RFC 2384. POP URIs have the form: "pop:" "//" [ user [ ";AUTH" ( "*" | auth_type ) ] "@" ] host [ ":" port ] Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The I. =item $3 The I, if given. =item $4 The I, if given (could be a I<*>). =item $5 The I. =item $6 The I, if given. =back =head1 REFERENCES =over 4 =item B<[RFC 2384]> Gellens, R.: I. August 1998. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Abigail. (I). =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/prospero.pm000644 000765 000024 00000004172 12654447171 023442 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::prospero; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$host $port $ppath $fieldname $fieldvalue $fieldspec/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $scheme = 'prospero'; my $uri = "(?k:(?k:$scheme)://(?k:$host)(?::(?k:$port))?" . "/(?k:$ppath)(?k:$fieldspec*))"; register_uri $scheme => $uri; pattern name => [qw (URI prospero)], create => $uri, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::prospero -- Returns a pattern for prospero URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{prospero}/ and print "Contains a prospero URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{prospero} Returns a pattern that matches I URIs, as defined by RFC 1738. prospero URIs have the form: "prospero:" "//" host [ ":" port ] "/" path [ fieldspec ] * Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The I. =item $3 The I. =item $4 The I, if given. =item $5 The propero path. =item $6 The field specifications, if given. There can be more field specifications; they will all be returned in C<$6>. =back =head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Abigail. (I). =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/RFC1035.pm000644 000765 000024 00000004005 12654447171 022507 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::RFC1035; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; use vars qw /@EXPORT @EXPORT_OK %EXPORT_TAGS @ISA/; use Exporter (); @ISA = qw /Exporter/; my %vars; BEGIN { $vars {low} = [qw /$digit $letter $let_dig $let_dig_hyp $ldh_str/]; $vars {parts} = [qw /$label $subdomain/]; $vars {domain} = [qw /$domain/]; } use vars map {@$_} values %vars; @EXPORT = qw /$host/; @EXPORT_OK = map {@$_} values %vars; %EXPORT_TAGS = (%vars, ALL => [@EXPORT_OK]); # RFC 1035. $digit = "[0-9]"; $letter = "[A-Za-z]"; $let_dig = "[A-Za-z0-9]"; $let_dig_hyp = "[-A-Za-z0-9]"; $ldh_str = "(?:[-A-Za-z0-9]+)"; $label = "(?:$letter(?:(?:$ldh_str){0,61}$let_dig)?)"; $subdomain = "(?:$label(?:[.]$label)*)"; $domain = "(?: |(?:$subdomain))"; 1; __END__ =pod =head1 NAME Regexp::Common::URI::RFC1035 -- Definitions from RFC1035; =head1 SYNOPSIS use Regexp::Common::URI::RFC1035 qw /:ALL/; =head1 DESCRIPTION This package exports definitions from RFC1035. It's intended usage is for Regexp::Common::URI submodules only. Its interface might change without notice. =head1 REFERENCES =over 4 =item B<[RFC 1035]> Mockapetris, P.: I. November 1987. =back =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/RFC1738.pm000644 000765 000024 00000011345 12654447171 022526 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::RFC1738; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; use vars qw /@EXPORT @EXPORT_OK %EXPORT_TAGS @ISA/; use Exporter (); @ISA = qw /Exporter/; my %vars; BEGIN { $vars {low} = [qw /$digit $digits $hialpha $lowalpha $alpha $alphadigit $safe $extra $national $punctuation $unreserved $unreserved_range $reserved $uchar $uchars $xchar $xchars $hex $escape/]; $vars {connect} = [qw /$port $hostnumber $toplabel $domainlabel $hostname $host $hostport $user $password $login/]; $vars {parts} = [qw /$fsegment $fpath $group $article $grouppart $search $database $wtype $wpath $psegment $fieldname $fieldvalue $fieldspec $ppath/]; } use vars map {@$_} values %vars; @EXPORT = qw /$host/; @EXPORT_OK = map {@$_} values %vars; %EXPORT_TAGS = (%vars, ALL => [@EXPORT_OK]); # RFC 1738, base definitions. # Lowlevel definitions. $digit = '[0-9]'; $digits = '[0-9]+'; $hialpha = '[A-Z]'; $lowalpha = '[a-z]'; $alpha = '[a-zA-Z]'; # lowalpha | hialpha $alphadigit = '[a-zA-Z0-9]'; # alpha | digit $safe = '[-$_.+]'; $extra = "[!*'(),]"; $national = '[][{}|\\^~`]'; $punctuation = '[<>#%"]'; $unreserved_range = q [-a-zA-Z0-9$_.+!*'(),]; # alphadigit | safe | extra $unreserved = "[$unreserved_range]"; $reserved = '[;/?:@&=]'; $hex = '[a-fA-F0-9]'; $escape = "(?:%$hex$hex)"; $uchar = "(?:$unreserved|$escape)"; $uchars = "(?:(?:$unreserved|$escape)*)"; $xchar = "(?:[$unreserved_range;/?:\@&=]|$escape)"; $xchars = "(?:(?:[$unreserved_range;/?:\@&=]|$escape)*)"; # Connection related stuff. $port = "(?:$digits)"; $hostnumber = "(?:$digits\[.]$digits\[.]$digits\[.]$digits)"; $toplabel = "(?:$alpha\[-a-zA-Z0-9]*$alphadigit|$alpha)"; $domainlabel = "(?:(?:$alphadigit\[-a-zA-Z0-9]*)?$alphadigit)"; $hostname = "(?:(?:$domainlabel\[.])*$toplabel)"; $host = "(?:$hostname|$hostnumber)"; $hostport = "(?:$host(?::$port)?)"; $user = "(?:(?:[$unreserved_range;?&=]|$escape)*)"; $password = "(?:(?:[$unreserved_range;?&=]|$escape)*)"; $login = "(?:(?:$user(?::$password)?\@)?$hostport)"; # Parts (might require more if we add more URIs). # FTP/file $fsegment = "(?:(?:[$unreserved_range:\@&=]|$escape)*)"; $fpath = "(?:$fsegment(?:/$fsegment)*)"; # NNTP/news. $group = "(?:$alpha\[-A-Za-z0-9.+_]*)"; $article = "(?:(?:[$unreserved_range;/?:&=]|$escape)+" . '@' . "$host)"; $grouppart = "(?:[*]|$article|$group)"; # It's important that # $article goes before # $group. # WAIS. $search = "(?:(?:[$unreserved_range;:\@&=]|$escape)*)"; $database = $uchars; $wtype = $uchars; $wpath = $uchars; # prospero $psegment = "(?:(?:[$unreserved_range?:\@&=]|$escape)*)"; $fieldname = "(?:(?:[$unreserved_range?:\@&]|$escape)*)"; $fieldvalue = "(?:(?:[$unreserved_range?:\@&]|$escape)*)"; $fieldspec = "(?:;$fieldname=$fieldvalue)"; $ppath = "(?:$psegment(?:/$psegment)*)"; # # The various '(?:(?:[$unreserved_range ...]|$escape)*)' above need # some loop unrolling to speed up the match. # 1; __END__ =pod =head1 NAME Regexp::Common::URI::RFC1738 -- Definitions from RFC1738; =head1 SYNOPSIS use Regexp::Common::URI::RFC1738 qw /:ALL/; =head1 DESCRIPTION This package exports definitions from RFC1738. It's intended usage is for Regexp::Common::URI submodules only. Its interface might change without notice. =head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =back =head1 AUTHOR Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/RFC1808.pm000644 000765 000024 00000010455 12654447171 022525 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::RFC1808; BEGIN { # This makes sure 'use warnings' doesn't bomb out on 5.005_*; # warnings won't be enabled on those old versions though. if ($] < 5.006 && !exists $INC {"warnings.pm"}) { $INC {"warnings.pm"} = 1; no strict 'refs'; *{"warnings::unimport"} = sub {0}; } } use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; use vars qw /@EXPORT @EXPORT_OK %EXPORT_TAGS @ISA/; use Exporter (); @ISA = qw /Exporter/; my %vars; BEGIN { $vars {low} = [qw /$punctuation $reserved_range $reserved $national $extra $safe $digit $digits $hialpha $lowalpha $alpha $alphadigit $hex $escape $unreserved_range $unreserved $uchar $uchars $pchar_range $pchar $pchars/], $vars {parts} = [qw /$fragment $query $param $params $segment $fsegment $path $net_loc $scheme $rel_path $abs_path $net_path $relativeURL $generic_RL $absoluteURL $URL/], } use vars map {@$_} values %vars; @EXPORT = qw /$host/; @EXPORT_OK = map {@$_} values %vars; %EXPORT_TAGS = (%vars, ALL => [@EXPORT_OK]); # RFC 1808, base definitions. # Lowlevel definitions. $punctuation = '[<>#%"]'; $reserved_range = q [;/?:@&=]; $reserved = "[$reserved_range]"; $national = '[][{}|\\^~`]'; $extra = "[!*'(),]"; $safe = '[-$_.+]'; $digit = '[0-9]'; $digits = '[0-9]+'; $hialpha = '[A-Z]'; $lowalpha = '[a-z]'; $alpha = '[a-zA-Z]'; # lowalpha | hialpha $alphadigit = '[a-zA-Z0-9]'; # alpha | digit $hex = '[a-fA-F0-9]'; $escape = "(?:%$hex$hex)"; $unreserved_range = q [-a-zA-Z0-9$_.+!*'(),]; # alphadigit | safe | extra $unreserved = "[$unreserved_range]"; $uchar = "(?:$unreserved|$escape)"; $uchars = "(?:(?:$unreserved+|$escape)*)"; $pchar_range = qq [$unreserved_range:\@&=]; $pchar = "(?:[$pchar_range]|$escape)"; $pchars = "(?:(?:[$pchar_range]+|$escape)*)"; # Parts $fragment = "(?:(?:[$unreserved_range$reserved_range]+|$escape)*)"; $query = "(?:(?:[$unreserved_range$reserved_range]+|$escape)*)"; $param = "(?:(?:[$pchar_range/]+|$escape)*)"; $params = "(?:$param(?:;$param)*)"; $segment = "(?:(?:[$pchar_range]+|$escape)*)"; $fsegment = "(?:(?:[$pchar_range]+|$escape)+)"; $path = "(?:$fsegment(?:/$segment)*)"; $net_loc = "(?:(?:[$pchar_range;?]+|$escape)*)"; $scheme = "(?:(?:[-a-zA-Z0-9+.]+|$escape)+)"; $rel_path = "(?:$path?(?:;$params)?(?:?$query)?)"; $abs_path = "(?:/$rel_path)"; $net_path = "(?://$net_loc$abs_path?)"; $relativeURL = "(?:$net_path|$abs_path|$rel_path)"; $generic_RL = "(?:$scheme:$relativeURL)"; $absoluteURL = "(?:$generic_RL|" . "(?:$scheme:(?:[$unreserved_range$reserved_range]+|$escape)*))"; $URL = "(?:(?:$absoluteURL|$relativeURL)(?:#$fragment)?)"; 1; __END__ =pod =head1 NAME Regexp::Common::URI::RFC1808 -- Definitions from RFC1808; =head1 SYNOPSIS use Regexp::Common::URI::RFC1808 qw /:ALL/; =head1 DESCRIPTION This package exports definitions from RFC1808. It's intended usage is for Regexp::Common::URI submodules only. Its interface might change without notice. =head1 REFERENCES =over 4 =item B<[RFC 1808]> Fielding, R.: I. June 1995. =back =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/RFC2384.pm000644 000765 000024 00000004464 12654447171 022530 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::RFC2384; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI::RFC1738 qw /$unreserved_range $escape $hostport/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; use vars qw /@EXPORT @EXPORT_OK %EXPORT_TAGS @ISA/; use Exporter (); @ISA = qw /Exporter/; my %vars; BEGIN { $vars {low} = [qw /$achar_range $achar $achars $achar_more/]; $vars {connect} = [qw /$enc_sasl $enc_user $enc_ext $enc_auth_type $auth $user_auth $server/]; $vars {parts} = [qw /$pop_url/]; } use vars map {@$_} values %vars; @EXPORT = qw /$host/; @EXPORT_OK = map {@$_} values %vars; %EXPORT_TAGS = (%vars, ALL => [@EXPORT_OK]); # RFC 2384, POP3. # Lowlevel definitions. $achar_range = "$unreserved_range&=~"; $achar = "(?:[$achar_range]|$escape)"; $achars = "(?:(?:[$achar_range]+|$escape)*)"; $achar_more = "(?:(?:[$achar_range]+|$escape)+)"; $enc_sasl = $achar_more; $enc_user = $achar_more; $enc_ext = "(?:[+](?:APOP|$achar_more))"; $enc_auth_type = "(?:$enc_sasl|$enc_ext)"; $auth = "(?:;AUTH=(?:[*]|$enc_auth_type))"; $user_auth = "(?:$enc_user$auth?)"; $server = "(?:(?:$user_auth\@)?$hostport)"; $pop_url = "(?:pop://$server)"; 1; __END__ =pod =head1 NAME Regexp::Common::URI::RFC2384 -- Definitions from RFC2384; =head1 SYNOPSIS use Regexp::Common::URI::RFC2384 qw /:ALL/; =head1 DESCRIPTION This package exports definitions from RFC2384. It's intended usage is for Regexp::Common::URI submodules only. Its interface might change without notice. =head1 REFERENCES =over 4 =item B<[RFC 2384]> Gellens, R.: I August 1998. =back =head1 AUTHOR Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/RFC2396.pm000644 000765 000024 00000012076 12654447171 022531 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::RFC2396; use Regexp::Common qw /pattern clean no_defaults/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; use vars qw /@EXPORT @EXPORT_OK %EXPORT_TAGS @ISA/; use Exporter (); @ISA = qw /Exporter/; my %vars; BEGIN { $vars {low} = [qw /$digit $upalpha $lowalpha $alpha $alphanum $hex $escaped $mark $unreserved $reserved $pchar $uric $urics $userinfo $userinfo_no_colon $uric_no_slash/]; $vars {parts} = [qw /$query $fragment $param $segment $path_segments $ftp_segments $rel_segment $abs_path $rel_path $path/]; $vars {connect} = [qw /$port $IPv4address $toplabel $domainlabel $hostname $host $hostport $server $reg_name $authority/]; $vars {URI} = [qw /$scheme $net_path $opaque_part $hier_part $relativeURI $absoluteURI $URI_reference/]; } use vars map {@$_} values %vars; @EXPORT = (); @EXPORT_OK = map {@$_} values %vars; %EXPORT_TAGS = (%vars, ALL => [@EXPORT_OK]); # RFC 2396, base definitions. $digit = '[0-9]'; $upalpha = '[A-Z]'; $lowalpha = '[a-z]'; $alpha = '[a-zA-Z]'; # lowalpha | upalpha $alphanum = '[a-zA-Z0-9]'; # alpha | digit $hex = '[a-fA-F0-9]'; $escaped = "(?:%$hex$hex)"; $mark = "[\\-_.!~*'()]"; $unreserved = "[a-zA-Z0-9\\-_.!~*'()]"; # alphanum | mark # %61-%7A, %41-%5A, %30-%39 # a - z A - Z 0 - 9 # %21, %27, %28, %29, %2A, %2D, %2E, %5F, %7E # ! ' ( ) * - . _ ~ $reserved = "[;/?:@&=+\$,]"; $pchar = "(?:[a-zA-Z0-9\\-_.!~*'():\@&=+\$,]|$escaped)"; # unreserved | escaped | [:@&=+$,] $uric = "(?:[;/?:\@&=+\$,a-zA-Z0-9\\-_.!~*'()]|$escaped)"; # reserved | unreserved | escaped $urics = "(?:(?:[;/?:\@&=+\$,a-zA-Z0-9\\-_.!~*'()]+|" . "$escaped)*)"; $query = $urics; $fragment = $urics; $param = "(?:(?:[a-zA-Z0-9\\-_.!~*'():\@&=+\$,]+|$escaped)*)"; $segment = "(?:$param(?:;$param)*)"; $path_segments = "(?:$segment(?:/$segment)*)"; $ftp_segments = "(?:$param(?:/$param)*)"; # NOT from RFC 2396. $rel_segment = "(?:(?:[a-zA-Z0-9\\-_.!~*'();\@&=+\$,]*|$escaped)+)"; $abs_path = "(?:/$path_segments)"; $rel_path = "(?:$rel_segment(?:$abs_path)?)"; $path = "(?:(?:$abs_path|$rel_path)?)"; $port = "(?:$digit*)"; $IPv4address = "(?:$digit+[.]$digit+[.]$digit+[.]$digit+)"; $toplabel = "(?:$alpha"."[-a-zA-Z0-9]*$alphanum|$alpha)"; $domainlabel = "(?:(?:$alphanum"."[-a-zA-Z0-9]*)?$alphanum)"; $hostname = "(?:(?:$domainlabel\[.])*$toplabel\[.]?)"; $host = "(?:$hostname|$IPv4address)"; $hostport = "(?:$host(?::$port)?)"; $userinfo = "(?:(?:[a-zA-Z0-9\\-_.!~*'();:&=+\$,]+|$escaped)*)"; $userinfo_no_colon = "(?:(?:[a-zA-Z0-9\\-_.!~*'();&=+\$,]+|$escaped)*)"; $server = "(?:(?:$userinfo\@)?$hostport)"; $reg_name = "(?:(?:[a-zA-Z0-9\\-_.!~*'()\$,;:\@&=+]*|$escaped)+)"; $authority = "(?:$server|$reg_name)"; $scheme = "(?:$alpha"."[a-zA-Z0-9+\\-.]*)"; $net_path = "(?://$authority$abs_path?)"; $uric_no_slash = "(?:[a-zA-Z0-9\\-_.!~*'();?:\@&=+\$,]|$escaped)"; $opaque_part = "(?:$uric_no_slash$urics)"; $hier_part = "(?:(?:$net_path|$abs_path)(?:[?]$query)?)"; $relativeURI = "(?:(?:$net_path|$abs_path|$rel_path)(?:[?]$query)?"; $absoluteURI = "(?:$scheme:(?:$hier_part|$opaque_part))"; $URI_reference = "(?:(?:$absoluteURI|$relativeURI)?(?:#$fragment)?)"; 1; __END__ =pod =head1 NAME Regexp::Common::URI::RFC2396 -- Definitions from RFC2396; =head1 SYNOPSIS use Regexp::Common::URI::RFC2396 qw /:ALL/; =head1 DESCRIPTION This package exports definitions from RFC2396. It's intended usage is for Regexp::Common::URI submodules only. Its interface might change without notice. =head1 REFERENCES =over 4 =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =back =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/RFC2806.pm000644 000765 000024 00000017622 12654447171 022527 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::RFC2806; use Regexp::Common::URI::RFC1035 qw /$domain/; use Regexp::Common::URI::RFC2396 qw /$unreserved $escaped $hex/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; use vars qw /@EXPORT @EXPORT_OK %EXPORT_TAGS @ISA/; use Exporter (); @ISA = qw /Exporter/; my %vars; BEGIN { $vars {low} = [qw /$dtmf_digit $wait_for_dial_tone $one_second_pause $pause_character $visual_separator $phonedigit $escaped_no_dquote $quoted_string $token_char $token_chars/]; $vars {parts} = [qw /$future_extension/]; $vars {connect} = [qw /$provider_hostname $provider_tag $service_provider $private_prefix $local_network_prefix $global_network_prefix $network_prefix/]; $vars {phone} = [qw /$phone_context_ident $phone_context_tag $area_specifier $post_dial $isdn_subaddress $t33_subaddress $local_phone_number $local_phone_number_no_future $base_phone_number $global_phone_number $global_phone_number_no_future $telephone_subscriber $telephone_subscriber_no_future/]; $vars {fax} = [qw /$fax_local_phone $fax_local_phone_no_future $fax_global_phone $fax_global_phone_no_future $fax_subscriber $fax_subscriber_no_future/]; $vars {modem} = [qw //]; } use vars map {@$_} values %vars; @EXPORT = (); @EXPORT_OK = map {@$_} values %vars; %EXPORT_TAGS = (%vars, ALL => [@EXPORT_OK]); # RFC 2806, URIs for tel, fax & modem. $dtmf_digit = "(?:[*#ABCD])"; $wait_for_dial_tone= "(?:w)"; $one_second_pause = "(?:p)"; $pause_character = "(?:[wp])"; # wait_for_dial_tone | one_second_pause. $visual_separator = "(?:[\\-.()])"; $phonedigit = "(?:[0-9\\-.()])"; # DIGIT | visual_separator $escaped_no_dquote = "(?:%(?:[01]$hex)|2[013-9A-Fa-f]|[3-9A-Fa-f]$hex)"; $quoted_string = "(?:%22(?:(?:%5C(?:$unreserved|$escaped))|" . "$unreserved+|$escaped_no_dquote)*%22)"; # It is unclear wether we can allow only unreserved # characters to unescaped, or can we also use uric # characters that are unescaped? Or pchars? $token_char = "(?:[!'*\\-.0-9A-Z_a-z~]|" . "%(?:2[13-7ABDEabde]|3[0-9]|4[1-9A-Fa-f]|" . "5[AEFaef]|6[0-9A-Fa-f]|7[0-9ACEace]))"; # Only allowing unreserved chars to be unescaped. $token_chars = "(?:(?:[!'*\\-.0-9A-Z_a-z~]+|" . "%(?:2[13-7ABDEabde]|3[0-9]|4[1-9A-Fa-f]|" . "5[AEFaef]|6[0-9A-Fa-f]|7[0-9ACEace]))*)"; $future_extension = "(?:;$token_chars" . "(?:=(?:(?:$token_chars(?:[?]$token_chars)?)|" . "$quoted_string))?)"; $provider_hostname = $domain; $provider_tag = "(?:tsp)"; $service_provider = "(?:;$provider_tag=$provider_hostname)"; $private_prefix = "(?:(?:[!'E-OQ-VX-Z_e-oq-vx-z~]|" . "(?:%(?:2[124-7CFcf]|3[AC-Fac-f]|4[05-9A-Fa-f]|" . "5[1-689A-Fa-f]|6[05-9A-Fa-f]|" . "7[1-689A-Ea-e])))" . "(?:[!'()*\\-.0-9A-Z_a-z~]+|" . "(?:%(?:2[1-9A-Fa-f]|3[AC-Fac-f]|" . "[4-6][0-9A-Fa-f]|7[0-9A-Ea-e])))*)"; $local_network_prefix = "(?:[0-9\\-.()*#ABCDwp]+)"; $global_network_prefix = "(?:[+][0-9\\-.()]+)"; $network_prefix = "(?:$global_network_prefix|$local_network_prefix)"; $phone_context_ident = "(?:$network_prefix|$private_prefix)"; $phone_context_tag = "(?:phone-context)"; $area_specifier = "(?:;$phone_context_tag=$phone_context_ident)"; $post_dial = "(?:;postd=[0-9\\-.()*#ABCDwp]+)"; $isdn_subaddress = "(?:;isub=[0-9\\-.()]+)"; $t33_subaddress = "(?:;tsub=[0-9\\-.()]+)"; $local_phone_number= "(?:[0-9\\-.()*#ABCDwp]+$isdn_subaddress?" . "$post_dial?$area_specifier" . "(?:$area_specifier|$service_provider|" . "$future_extension)*)"; $local_phone_number_no_future = "(?:[0-9\\-.()*#ABCDwp]+$isdn_subaddress?" . "$post_dial?$area_specifier" . "(?:$area_specifier|$service_provider)*)"; $fax_local_phone = "(?:[0-9\\-.()*#ABCDwp]+$isdn_subaddress?" . "$t33_subaddress?$post_dial?$area_specifier" . "(?:$area_specifier|$service_provider|" . "$future_extension)*)"; $fax_local_phone_no_future = "(?:[0-9\\-.()*#ABCDwp]+$isdn_subaddress?" . "$t33_subaddress?$post_dial?$area_specifier" . "(?:$area_specifier|$service_provider)*)"; $base_phone_number = "(?:[0-9\\-.()]+)"; $global_phone_number = "(?:[+]$base_phone_number$isdn_subaddress?" . "$post_dial?" . "(?:$area_specifier|$service_provider|" . "$future_extension)*)"; $global_phone_number_no_future = "(?:[+]$base_phone_number$isdn_subaddress?" . "$post_dial?" . "(?:$area_specifier|$service_provider)*)"; $fax_global_phone = "(?:[+]$base_phone_number$isdn_subaddress?" . "$t33_subaddress?$post_dial?" . "(?:$area_specifier|$service_provider|" . "$future_extension)*)"; $fax_global_phone_no_future = "(?:[+]$base_phone_number$isdn_subaddress?" . "$t33_subaddress?$post_dial?" . "(?:$area_specifier|$service_provider)*)"; $telephone_subscriber = "(?:$global_phone_number|$local_phone_number)"; $telephone_subscriber_no_future = "(?:$global_phone_number_no_future|" . "$local_phone_number_no_future)"; $fax_subscriber = "(?:$fax_global_phone|$fax_local_phone)"; $fax_subscriber_no_future = "(?:$fax_global_phone_no_future|" . "$fax_local_phone_no_future)"; 1; __END__ =pod =head1 NAME Regexp::Common::URI::RFC2806 -- Definitions from RFC2806; =head1 SYNOPSIS use Regexp::Common::URI::RFC2806 qw /:ALL/; =head1 DESCRIPTION This package exports definitions from RFC2806. It's intended usage is for Regexp::Common::URI submodules only. Its interface might change without notice. =head1 REFERENCES =over 4 =item B<[RFC 2616]> Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and Berners-Lee, Tim: I. June 1999. =back =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/tel.pm000644 000765 000024 00000005370 12654447171 022356 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::tel; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC2806 qw /$telephone_subscriber $telephone_subscriber_no_future/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $tel_scheme = 'tel'; my $tel_uri = "(?k:(?k:$tel_scheme):(?k:$telephone_subscriber))"; my $tel_uri_nf = "(?k:(?k:$tel_scheme):(?k:$telephone_subscriber_no_future))"; register_uri $tel_scheme => $tel_uri; pattern name => [qw (URI tel)], create => $tel_uri ; pattern name => [qw (URI tel nofuture)], create => $tel_uri_nf ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::tel -- Returns a pattern for telephone URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{tel}/ and print "Contains a telephone URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{tel} Returns a pattern that matches I URIs, as defined by RFC 2806. Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The phone number, including any possible add-ons like ISDN subaddress, a post dial part, area specifier, service provider, etc. =back =head2 C<$RE{URI}{tel}{nofuture}> As above (including what's returned by C<{-keep}>), with the exception that I are not allowed. Without allowing those I, it becomes much easier to check a URI if the correct syntax for post dial, service provider, phone context, etc has been used - otherwise the regex could always classify them as a I. =head1 REFERENCES =over 4 =item B<[RFC 1035]> Mockapetris, P.: I. November 1987. =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =item B<[RFC 2806]> Vaha-Sipila, A.: I. April 2000. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/telnet.pm000644 000765 000024 00000004413 12654447171 023062 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::telnet; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$user $password $host $port/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $telnet_uri = "(?k:(?k:telnet)://(?:(?k:(?k:$user)(?::(?k:$password))?)\@)?" . "(?k:(?k:$host)(?::(?k:$port))?)(?k:/)?)"; register_uri telnet => $telnet_uri; pattern name => [qw (URI telnet)], create => $telnet_uri, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::telnet -- Returns a pattern for telnet URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{telnet}/ and print "Contains a telnet URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{telnet} Returns a pattern that matches I URIs, as defined by RFC 1738. Telnet URIs have the form: "telnet:" "//" [ user [ ":" password ] "@" ] host [ ":" port ] [ "/" ] Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The scheme. =item $3 The username:password combo, or just the username if there is no password. =item $4 The username, if given. =item $5 The password, if given. =item $6 The host:port combo, or just the host if there's no port. =item $7 The host. =item $8 The port, if given. =item $9 The trailing slash, if any. =back =head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/tv.pm000644 000765 000024 00000003751 12654447171 022224 0ustar00abigailstaff000000 000000 # TV URLs. # Internet draft: draft-zigmond-tv-url-03.txt package Regexp::Common::URI::tv; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC2396 qw /$hostname/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $tv_scheme = 'tv'; my $tv_url = "(?k:(?k:$tv_scheme):(?k:$hostname)?)"; register_uri $tv_scheme => $tv_url; pattern name => [qw (URI tv)], create => $tv_url, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::tv -- Returns a pattern for tv URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{tv}/ and print "Contains a tv URI.\n"; } =head1 DESCRIPTION =head2 C<$RE{URI}{tv}> Returns a pattern that recognizes TV uris as per an Internet draft [DRAFT-URI-TV]. Under C<{-keep}>, the following are returned: =over 4 =item $1 The entire URI. =item $2 The scheme. =item $3 The host. =back =head1 REFERENCES =over 4 =item B<[DRAFT-URI-TV]> Zigmond, D. and Vickers, M: I. December 2000. =item B<[RFC 2396]> Berners-Lee, Tim, Fielding, R., and Masinter, L.: I. August 1998. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut Regexp-Common-2016020301/lib/Regexp/Common/URI/wais.pm000644 000765 000024 00000004642 12654447171 022536 0ustar00abigailstaff000000 000000 package Regexp::Common::URI::wais; use Regexp::Common qw /pattern clean no_defaults/; use Regexp::Common::URI qw /register_uri/; use Regexp::Common::URI::RFC1738 qw /$host $port $search $database $wtype $wpath/; use strict; use warnings; use vars qw /$VERSION/; $VERSION = '2016020301'; my $scheme = 'wais'; my $uri = "(?k:(?k:$scheme)://(?k:$host)(?::(?k:$port))?/(?k:(?k:$database)" . "(?k:[?](?k:$search)|/(?k:$wtype)/(?k:$wpath))?))"; register_uri $scheme => $uri; pattern name => [qw (URI WAIS)], create => $uri, ; 1; __END__ =pod =head1 NAME Regexp::Common::URI::wais -- Returns a pattern for WAIS URIs. =head1 SYNOPSIS use Regexp::Common qw /URI/; while (<>) { /$RE{URI}{WAIS}/ and print "Contains a WAIS URI.\n"; } =head1 DESCRIPTION =head2 $RE{URI}{WAIS} Returns a pattern that matches I URIs, as defined by RFC 1738. WAIS URIs have the form: "wais:" "//" host [ ":" port ] "/" database [ ( "?" search ) | ( "/" wtype "/" wpath ) ] Under C<{-keep}>, the following are returned: =over 4 =item $1 The complete URI. =item $2 The I. =item $3 The I. =item $4 The I, if given. =item $5 The I, followed by I or I, if given. =item $6 The I. =item $7 The part following the I if given, including the question mark or slash. =item $8 The I part, if given. =item $9 The I, if given. =item $10 The I, if given. =back =head1 REFERENCES =over 4 =item B<[RFC 1738]> Berners-Lee, Tim, Masinter, L., McCahill, M.: I. December 1994. =back =head1 SEE ALSO L for other supported URIs. =head1 AUTHOR Damian Conway (damian@conway.org) =head1 MAINTAINANCE This package is maintained by Abigail S<(I)>. =head1 BUGS AND IRRITATIONS Bound to be plenty. =head1 LICENSE and COPYRIGHT This software is Copyright (c) 2001 - 2009, Damian Conway and Abigail. This module is free software, and maybe used under any of the following licenses: 1) The Perl Artistic License. See the file COPYRIGHT.AL. 2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2. 3) The BSD Licence. See the file COPYRIGHT.BSD. 4) The MIT Licence. See the file COPYRIGHT.MIT. =cut

is C. =head2 C<$RE{net}{IPv4}{hex}{-sep}> Returns a pattern that matches a valid IP address in "dotted hexadecimal", with the letters C to C capitalized. If C<< -sep=I